working  paper 
department 
of  economics 


Efficient  Estimation  and  Identification  of 
Simultaneous  Equation  Models 
with  Covariance  Restrictions 
Jerry  Hausman 
Whitney  Newey 
William  Taylor 
Number  331  Revised  October  1983 


massachusetts 
institute  of 

technology 

50  memorial  drive 
Cambridge,  mass. 02139 






Efficient Estimation and Identification of Simultaneous Equation
Models with Covariance Restrictions


Jerry  Hausman 


Whitney  Newey 
Princeton 

William  Taylor 
ATT 


Revised  October  1985 


Hausman  and  Newey  thank  the  NSF  for  research  support.   This  paper  is  a 
revision  of  an  earlier  version  presented  at  the  European  Econometric 
Meetings,  1982.  F.  Fisher  and  T.  Rothenberg  have  provided  useful 
comments. 




This  paper  is  offered  in  recognition  of  the  fiftieth  birthday  of  the 
Cowles  Foundation. 


1.  Introduction

In  the  pioneering  research  in  econometrics  done  at  the  Cowles 
Foundation, estimation techniques for simultaneous equations models were
studied extensively. Maximum likelihood estimation methods were applied to
both the single equation case (LIML) and to the complete simultaneous
equations models (FIML). It is interesting to note that while questions of
identification were completely solved for the case of coefficient
restrictions, the problem of identification with covariance restrictions
remained. Further research by Fisher (1966), Rothenberg (1971), and Wegge
(1965)  advanced  our  knowledge  in  this  field,  with  Fisher's  examination  of 
certain  special  cases  especially  insightful.  In  a  companion  paper,  Hausman 
and  Taylor  (1983),  we  give  conditions  in  terms  of  the  interaction  of 
restrictions  on  the  disturbance  covariance  matrix  and  restrictions  on  the 
coefficients  of  the  endogenous  variables  for  the  identification  problem. 
What  is  especially  interesting  about  our  solution  is  that  covariance 
restrictions,  if  they  are  to  be  effective  in  aiding  identification,  must  take 
one  of  two  forms.  First,  covariance  restrictions  can  cause  an  endogenous 
variable  to  be  predetermined  in  a  particular  equation,  e.g.,  a  "relatively 
recursive"  specification.  Or,  covariance  restrictions  can  lead  to  an 
estimated  residual  from  an  identified  equation  being  predetermined  in  another 
equation.  Both  of  these  forms  of  identification  have  ready  interpretations 
in  estimation  as  instrumental  variable  procedures,  which  links  them  to  the 
situation  where  only  coefficient  restrictions  are  present. 


For full information maximum likelihood (FIML), the Cowles
Foundation research considered the case of covariance restrictions when
the covariance matrix of the residuals is specified to be diagonal:
Koopmans, Rubin, and Leipnik (1950, pp. 154-211). The case of a diagonal
covariance matrix is also analyzed by Malinvaud (1970, pp. 678-682) and
by Rothenberg (1973, pp. 77-79 and pp. 94-115), who also does FIML
estimation in two small simultaneous equations models to assess the value
of the covariance restrictions. But covariance restrictions are a
largely unexplored topic in simultaneous equations estimation, perhaps
because of a reluctance to specify a priori restrictions on the
disturbance covariances.¹ However, an important contributory cause of
this situation may have been the lack of a simple, asymptotically
efficient estimation procedure for the case of covariance restrictions.
Rothenberg and Leenders (1964), in their proof of the efficiency of the
Zellner-Theil (1962) three stage least squares (3SLS) estimator, showed
that the presence of covariance restrictions would make FIML
asymptotically more efficient than 3SLS. The Cramer-Rao asymptotic lower
bound is reduced by covariance restrictions, but 3SLS does not adequately
account for these restrictions. The reason for this finding is that
simply imposing the restrictions on the covariance matrix is not adequate
when endogenous variables are present because of the lack of
block-diagonality of the information matrix between the slope
coefficients and the unknown covariance parameters. In fact, imposing the
covariance restrictions on the 3SLS estimator does not improve its
asymptotic efficiency. Thus efficient estimation seemed to require FIML.²
The role of covariance restrictions in establishing identification in the
simultaneous equations model was not fully understood, nor did imposing
such restrictions improve the asymptotic efficiency of the most popular
full information estimator. Perhaps these two reasons, more than the
lack of a priori disturbance covariance restrictions, may have led to
their infrequent use.

1. Of course, at a more fundamental level covariance restrictions are
required for any structural estimation in terms of the specification of
variables as exogenous or predetermined, cf. Fisher (1966, Ch. 4).

Since our identification results have an instrumental variable
interpretation, it is natural to think of using them in a 3SLS-like
instrumental variables estimator. Madansky (1964) gave an instrumental
variable interpretation to 3SLS, and here we augment the 3SLS estimator by
the additional instruments which the covariance restrictions imply. That
is, in an equation where the covariance restrictions cause a previously
endogenous variable to be predetermined, we use the variable as an
instrument for itself, if it is included in the equation, or just as an
instrument if it is not included. In the alternative case, we use the
appropriate estimated residuals from other identified equations to form
instruments for a given equation. This estimator, which we call the
augmented three stage least squares estimator (A3SLS), is shown to be
more efficient than the 3SLS estimator when effective covariance
restrictions are present.

2. Rothenberg and Leenders (1964) do propose a linearized maximum
likelihood estimator which corresponds to one Newton step beginning from
a consistent estimate. As usual, this estimator is asymptotically
equivalent to FIML. Also, an important case in which covariance
restrictions have been widely used is that of a recursive specification
in which FIML coincides with ordinary least squares (OLS).

To  see  how  efficient  the  A3SLS  estimator  is,  we  need  to  compare  it 
to  FIML  which  takes  account  of  the  covariance  restrictions.   Hausman 
(1975)  gave  an  instrumental  variable  interpretation  of  FIML  when  no 
covariance  restrictions  were  present,  which  we  extend  to  the  case  with 
covariance  restrictions.  The  interpretation  seems  especially  attractive 
because  we  see  that  instead  of  using  the  predicted  value  of  the 
endogenous  variables  based  only  on  the  predetermined  variables  from  the 
reduced  form  as  instruments,  when  covariance  restrictions  are  present, 
FIML  also  uses  that  part  of  the  estimated  residual  from  the  appropriate 
reduced  form  equation  which  is  uncorrelated  with  the  residual  in  the 
equation  where  the  endogenous  variables  are  included.  Thus  more 
information  is  used  in  forming  the  instruments  than  in  the  case  where 


covariance  restrictions  are  absent.  More  importantly,  the  instrumental 
variable  interpretation  of  FIML  leads  to  a  straightforward  proof  that  the 
A3SLS  estimator  is  asymptotically  efficient  with  respect  to  the  FIML 
estimator. The A3SLS estimator provides a computationally convenient
estimator which is also asymptotically efficient. Thus we are left with
an  attractive  solution  to  both  identification  and  estimation  of  the 
traditional  simultaneous  equations  model.   Identification  and  estimation 
both  are  closely  related  to  the  notion  of  instrumental  variables  which 
provides  an  extremely  useful  concept  upon  which  to  base  our  understanding 
of  simultaneous  equations  model  specifications. 

In  addition  to  the  development  of  the  A3SLS  estimator,  we  also 
reconsider  the  assignment  condition  for  identification  defined  by  Hausman 
and Taylor (1983). We prove that the assignment condition, which assigns
covariance restrictions to one of the two equations from which the
restriction arises, provides a necessary condition for identification.
The  rank  condition  provides  a  stronger  necessary  condition  than  the 
condition  of  Fisher  (1966).  These  necessary  conditions  apply  equation  by 
equation. We also provide a sufficient condition for identification in
terms  of  the  structural  parameters  of  the  entire  system.   Lastly,  we 
provide  straightforward  specification  tests  for  the  covariance 
restrictions  which  can  be  used  to  test  non-zero  covariances. 


2.  Estimation  in  a  Two  Equation  Model 

We begin with a simple two equation simultaneous equation model with
a diagonal covariance matrix, since many of the key results are
straightforward to derive in this context. Consider an industry supply
curve which in the short run exhibits decreasing returns to scale. Quantity
demanded is thus an appropriate included variable in the supply equation
which determines price, $y_1$, as a function of quantity demanded, $y_2$.
Also included in the specification of the supply equation are the
quantities of fixed factors and prices of variable factors, both of which
are assumed to be exogenous. The demand equation has price as a jointly
endogenous explanatory variable together with an income variable assumed
to be exogenous. We assume the covariance matrix of the residuals to be
diagonal, since shocks from the demand side of the market are assumed to
be fully captured by the inclusion of $y_2$ in the supply equation. The
model specification in this simple case is

(2.1)  $y_1 = \beta_{12} y_2 + \gamma_{11} z_1 + \varepsilon_1$

(2.2)  $y_2 = \beta_{21} y_1 + \gamma_{22} z_2 + \varepsilon_2$

where we have further simplified by including only one exogenous variable
in equation (2.1). We assume that we have T observations so that each
variable in equations (2.1) and (2.2) represents a T x 1 vector. The
stochastic assumptions are $E(\varepsilon_i | z_1, z_2) = 0$ for i = 1,2,
$var(\varepsilon_i | z_1, z_2) = \sigma_{ii}$, and
$\sigma_{12} = cov(\varepsilon_1, \varepsilon_2 | z_1, z_2) = 0$.
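The two equation model can be simulated directly from its reduced form. The Python sketch below is illustrative only: the function name `simulate_two_equation_model`, the parameter values, and the choice of standard normal exogenous variables are assumptions made for this example and do not come from the paper. It draws data with uncorrelated disturbances and shows that $y_2$ is nevertheless correlated with $\varepsilon_1$, with population covariance $\beta_{21}\sigma_{11}/(1-\beta_{12}\beta_{21})$, while $\varepsilon_2$ is not.

```python
import numpy as np

def simulate_two_equation_model(T, b12, b21, g11, g22, s11=1.0, s22=1.0, seed=0):
    """Simulate T observations from (2.1)-(2.2) with cov(e1, e2) = 0.

    All names and default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    z1 = rng.normal(size=T)                  # exogenous variable in (2.1)
    z2 = rng.normal(size=T)                  # exogenous variable in (2.2)
    e1 = np.sqrt(s11) * rng.normal(size=T)   # disturbance of (2.1)
    e2 = np.sqrt(s22) * rng.normal(size=T)   # disturbance of (2.2), independent of e1
    d = 1.0 - b12 * b21                      # must be nonzero
    # Reduced form: solve (2.1) and (2.2) jointly for y1 and y2.
    y1 = (g11 * z1 + b12 * g22 * z2 + e1 + b12 * e2) / d
    y2 = (b21 * g11 * z1 + g22 * z2 + b21 * e1 + e2) / d
    return y1, y2, z1, z2, e1, e2

if __name__ == "__main__":
    b12, b21 = 0.5, -0.8
    y1, y2, z1, z2, e1, e2 = simulate_two_equation_model(200_000, b12, b21, 1.0, 1.0)
    d = 1.0 - b12 * b21
    print("sample cov(y2, e1):", np.mean(y2 * e1), "population:", b21 / d)
    print("sample cov(e2, e1):", np.mean(e2 * e1), "population: 0")
```

The nonzero covariance between $y_2$ and $\varepsilon_1$ is what rules out least squares on (2.1), and the zero covariance between the two disturbances is the restriction exploited in the rest of this section.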

Inspection of equations (2.1) and (2.2) shows that the order
condition is satisfied so that each equation is identified by coefficient
restrictions alone, so long as the rank condition does not fail. If the
covariance restriction is neglected, each equation is just-identified so
that 3SLS is identical to 2SLS on each equation. Note that for each
equation, 2SLS uses the instruments $W_i = (Z\hat{\Pi}_j, z_i)$, $i \neq j$,
where $Z = (z_1\ z_2)$ and $\hat{\Pi}_j$ is the vector of reduced form
coefficients for the (other) included endogenous variable.³ To see how
FIML differs from the instrumental variables (IV) estimator, we solve for
the first order conditions of the likelihood function under the assumption
that the $\varepsilon_i$'s are normally distributed. Of course, as is the
case for linear simultaneous equation estimation with only coefficient
restrictions, failure of the normality assumption does not lead to
inconsistency of the estimates. For the two equation example, the
likelihood function takes the form

(2.3)  $L = c - \frac{T}{2}\log(\sigma_{11}\sigma_{22}) + T\log|1 - \beta_{12}\beta_{21}| - \frac{1}{2\sigma_{11}}(y_1 - X_1\delta_1)'(y_1 - X_1\delta_1) - \frac{1}{2\sigma_{22}}(y_2 - X_2\delta_2)'(y_2 - X_2\delta_2)$

where c is a constant and the $X_i$'s and $\delta_i$'s contain the right
hand side variables and unknown coefficients respectively, e.g.,
$X_1 = (y_2\ z_1)$ and $\delta_1 = (\beta_{12}, \gamma_{11})'$.

3. Of course, because of the condition of just identification, a
numerically identical result would be obtained if instruments
$W_i = (z_1, z_2)$ were used.

To solve for the FIML estimator, we find the first order conditions
of the likelihood function (2.3) with respect to the parameters of the
first equation; results for the second equation are identical. The
three first order conditions are


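(2.4a)  $-\frac{T\beta_{21}}{1-\beta_{12}\beta_{21}} + \frac{1}{\sigma_{11}}(y_1 - X_1\delta_1)'y_2 = 0$

(2.4b)  $\frac{1}{\sigma_{11}}(y_1 - X_1\delta_1)'z_1 = 0$

(2.4c)  $-\frac{T}{2\sigma_{11}} + \frac{1}{2\sigma_{11}^2}(y_1 - X_1\delta_1)'(y_1 - X_1\delta_1) = 0$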


Rearranging equation (2.4c) yields the familiar solution for the
variance, $\hat\sigma_{11} = (1/T)(y_1 - X_1\hat\delta_1)'(y_1 - X_1\hat\delta_1)$.
Equation (2.4b) has the usual OLS form which is to be expected since
$z_1$ is an exogenous variable. It is equation (2.4a) where the
simultaneous equations nature of the model appears with the presence of
$-T\beta_{21}/(1-\beta_{12}\beta_{21})$, which arises from the
Jacobian term in the likelihood function: see Hausman (1975).
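As a numerical sanity check on the first order condition for $\beta_{12}$, the Python sketch below compares the analytic score $-T\beta_{21}/(1-\beta_{12}\beta_{21}) + (y_1 - X_1\delta_1)'y_2/\sigma_{11}$ with a central finite difference of the two equation log likelihood, $-(T/2)\log(\sigma_{11}\sigma_{22}) + T\log|1-\beta_{12}\beta_{21}| - \varepsilon_1'\varepsilon_1/(2\sigma_{11}) - \varepsilon_2'\varepsilon_2/(2\sigma_{22})$. The function names, the arbitrary test data, and the parameter values are assumptions chosen for illustration.

```python
import numpy as np

def log_likelihood(theta, y1, y2, z1, z2):
    """Two equation log likelihood without the constant.

    theta = (b12, g11, b21, g22, s11, s22); names are illustrative.
    """
    b12, g11, b21, g22, s11, s22 = theta
    T = y1.shape[0]
    e1 = y1 - b12 * y2 - g11 * z1
    e2 = y2 - b21 * y1 - g22 * z2
    return (-0.5 * T * np.log(s11 * s22)
            + T * np.log(abs(1.0 - b12 * b21))   # Jacobian term
            - e1 @ e1 / (2.0 * s11)
            - e2 @ e2 / (2.0 * s22))

def score_beta12(theta, y1, y2, z1, z2):
    """Analytic derivative of the log likelihood with respect to b12."""
    b12, g11, b21, g22, s11, s22 = theta
    T = y1.shape[0]
    e1 = y1 - b12 * y2 - g11 * z1
    return -T * b21 / (1.0 - b12 * b21) + e1 @ y2 / s11

def numerical_score_beta12(theta, data, h=1e-6):
    """Central finite difference in the first element of theta (b12)."""
    up = np.array(theta, dtype=float)
    dn = np.array(theta, dtype=float)
    up[0] += h
    dn[0] -= h
    return (log_likelihood(up, *data) - log_likelihood(dn, *data)) / (2.0 * h)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = tuple(rng.normal(size=(4, 50)))       # y1, y2, z1, z2 (arbitrary values)
    theta = (0.5, 1.0, -0.8, 1.0, 1.0, 2.0)
    print(score_beta12(theta, *data), numerical_score_beta12(theta, data))
```

The check holds for any data and any admissible parameter values, since it only verifies the differentiation and not the statistical properties of the estimator.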


Now the first order conditions, equations (2.4a) - (2.4c), can be
solved by numerical methods which maximize the likelihood function.
Koopmans et al. (1950) have a lengthy discussion of various numerical
techniques for maximization which must be one of the earliest treatments
of this problem in the econometrics literature. But we can solve
equation (2.4a) in a particular way, using the reduced form specification
to see the precise role of the covariance restrictions in the model. We
first multiply equation (2.4a) by $\sigma_{11}$ to get


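(2.5)  $-\frac{T\beta_{21}\sigma_{11}}{1-\beta_{12}\beta_{21}} + y_2'(y_1 - X_1\delta_1) = 0.$

We now do the substitutions from the reduced form equations
$y_i = Z\Pi_i + v_i$. Using the fact that
$v_2 = \beta_{21}\varepsilon_1/(1-\beta_{12}\beta_{21}) + \varepsilon_2/(1-\beta_{12}\beta_{21})$,
together with $T\sigma_{11} = (y_1 - X_1\delta_1)'(y_1 - X_1\delta_1)$ from
equation (2.4c), we transform equation (2.5) to

(2.6)  $-\left(v_2 - \frac{y_2 - X_2\delta_2}{1-\beta_{12}\beta_{21}}\right)'(y_1 - X_1\delta_1) + (Z\Pi_2 + v_2)'(y_1 - X_1\delta_1) = 0.$

Canceling terms, we find the key result

(2.7)  $\left(Z\Pi_2 + \frac{\varepsilon_2}{1-\beta_{12}\beta_{21}}\right)'\varepsilon_1 = 0.$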
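The passage from the Jacobian form of the first order condition to the instrument form rests on an exact algebraic identity. For any parameter values, $y_2 = Z\Pi_2 + (\beta_{21}\varepsilon_1 + \varepsilon_2)/(1-\beta_{12}\beta_{21})$ when $\varepsilon_1$ and $\varepsilon_2$ are the structural residuals evaluated at those values, so $y_2'\varepsilon_1 - \beta_{21}\varepsilon_1'\varepsilon_1/(1-\beta_{12}\beta_{21})$ equals $(Z\Pi_2 + \varepsilon_2/(1-\beta_{12}\beta_{21}))'\varepsilon_1$. The Python sketch below checks this numerically; the function name `foc_two_forms` and its inputs are illustrative assumptions.

```python
import numpy as np

def foc_two_forms(y1, y2, z1, z2, b12, g11, b21, g22):
    """Return the beta_12 first order condition written in two equal ways."""
    d = 1.0 - b12 * b21
    e1 = y1 - b12 * y2 - g11 * z1        # residual of the first equation
    e2 = y2 - b21 * y1 - g22 * z2        # residual of the second equation
    # Form 1: sigma_11 times (2.4a), with T * sigma_11 replaced by e1'e1.
    jacobian_form = -b21 * (e1 @ e1) / d + y2 @ e1
    # Form 2: the instrument Z Pi_2 + e2 / d applied to e1.
    z_pi2 = (b21 * g11 * z1 + g22 * z2) / d
    instrument_form = (z_pi2 + e2 / d) @ e1
    return jacobian_form, instrument_form

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    y1, y2, z1, z2 = rng.normal(size=(4, 100))   # arbitrary data
    print(foc_two_forms(y1, y2, z1, z2, 0.5, 1.0, -0.8, 1.0))
```

Because the identity is purely algebraic, the two numbers agree up to rounding for arbitrary data, which is why FIML can be read as an instrumental variables estimator whose instrument for $y_2$ contains the residual of the second equation.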


Without the covariance restriction, we would have the result

(2.8)  $(Z\Pi_2)'\varepsilon_1 = 0, \quad \Pi_2 = \left(\frac{\beta_{21}\gamma_{11}}{1-\beta_{12}\beta_{21}},\ \frac{\gamma_{22}}{1-\beta_{12}\beta_{21}}\right)'$

which is the instrumental variable interpretation of FIML given by
Hausman (1975), equation (12). But in equation (2.7), we have the
additional term $\varepsilon_2/(1-\beta_{12}\beta_{21})$. What has happened
is that FIML has used the covariance restrictions to form a better
instrument. Remember that $y_2$ forms the best instrument for itself if it
is predetermined in equation (2.1). But here, $y_2$ is jointly endogenous
since it is correlated with $\varepsilon_1$: from the reduced form equation

$y_2 = Z\Pi_2 + \frac{\varepsilon_2}{1-\beta_{12}\beta_{21}} + \frac{\beta_{21}\varepsilon_1}{1-\beta_{12}\beta_{21}}$

FIML cannot use the last term in forming the instrument for $y_2$ since
$\beta_{21}\varepsilon_1/(1-\beta_{12}\beta_{21})$ is correlated with the
residual $\varepsilon_1$ in equation (2.1). It is this last term which
makes $y_2$ endogenous in the first equation. However, FIML can use
$\varepsilon_2/(1-\beta_{12}\beta_{21})$ because $\varepsilon_2$ is
uncorrelated with $\varepsilon_1$ by the covariance restriction
$\sigma_{12} = 0$. By using this term, FIML creates a better instrument
than ordinary 2SLS, which ignores $\varepsilon_2/(1-\beta_{12}\beta_{21})$.
Our two equation example makes it clear why 3SLS is not asymptotically
efficient when covariance restrictions are present. FIML uses better
instruments than 3SLS and produces a better estimate of the included
endogenous variables.

Two other important cases can be examined with our simple two
equation model. First, suppose that $\beta_{21} = 0$. The specification is
then triangular, and given the diagonal covariance matrix, the model is
recursive. Here, the FIML instrument is $Z\Pi_2 + \varepsilon_2 = y_2$ so
that $y_2$ is predetermined and FIML becomes OLS as expected. The second
case returns to $\beta_{21} \neq 0$ but sets $\gamma_{22} = 0$. The first
equation is no longer identified by coefficient restrictions alone, but it
is identified by the covariance restrictions because the FIML instruments
are

(2.9)  $W_1 = [Z\Pi_2 + \varepsilon_2/(1-\beta_{12}\beta_{21}),\ z_1]$.

Because of the addition of the residual term in $W_1$, the instrument
matrix has full rank and the coefficients can be estimated. Both of these
estimation results arise because we have restrictions on the matrix
$(B')^{-1}\Sigma$, where B is the matrix of all coefficients of endogenous
variables and $\Sigma$ is the disturbance covariance matrix. The
estimation results are closely connected with the identification results
when covariance restrictions are present; see Lemma 3, Proposition 6 of
Hausman and Taylor (1982), and Section 5.
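The second case can be illustrated numerically. In the Python sketch below, which is a simplified illustration under assumed parameter values and not the paper's estimator, $\gamma_{22} = 0$ so that $z_1$ is the only exogenous variable. The 2SLS fitted value of $y_2$ is then an exact multiple of $z_1$ and the usual instrument set has rank one, whereas the residual from the second equation, which is uncorrelated with $\varepsilon_1$ under $\sigma_{12} = 0$, supplies the missing instrument. All function names are hypothetical.

```python
import numpy as np

def simulate_gamma22_zero(T, b12, b21, g11, seed=0):
    """Simulate the two equation model with gamma_22 = 0 and sigma_12 = 0."""
    rng = np.random.default_rng(seed)
    z1, e1, e2 = rng.normal(size=(3, T))
    d = 1.0 - b12 * b21
    y1 = (g11 * z1 + e1 + b12 * e2) / d
    y2 = (b21 * g11 * z1 + b21 * e1 + e2) / d
    return y1, y2, z1

def two_sls_instrument_rank(y2, z1):
    """Rank of the 2SLS instrument set (fitted y2, z1) when z1 is the only exogenous variable."""
    y2_hat = z1 * ((z1 @ y2) / (z1 @ z1))      # fitted value is a multiple of z1
    return np.linalg.matrix_rank(np.column_stack([y2_hat, z1]))

def estimate_first_equation(y1, y2, z1):
    """IV estimate of (beta_12, gamma_11) using the residual of the second equation."""
    # Second equation is y2 = b21 * y1 + e2; z1 is excluded and is its instrument.
    b21_hat = (z1 @ y2) / (z1 @ y1)
    e2_hat = y2 - b21_hat * y1
    X1 = np.column_stack([y2, z1])
    W1 = np.column_stack([e2_hat, z1])         # residual plus z1 as instruments
    return np.linalg.solve(W1.T @ X1, W1.T @ y1)

if __name__ == "__main__":
    y1, y2, z1 = simulate_gamma22_zero(100_000, 0.5, -0.8, 1.0)
    print("rank of 2SLS instruments:", two_sls_instrument_rank(y2, z1))
    print("estimate of (beta_12, gamma_11):", estimate_first_equation(y1, y2, z1))
```

The sketch uses the estimated residual in a simple just-identified IV step; it shows identification through the covariance restriction but makes no claim about efficiency.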

FIML needs to be iterated to solve the first order conditions; in
our two equation case, we see that the original first order condition
(2.4a) or its transformed version, equation (2.7), is nonlinear.
Computational considerations when estimating FIML in this form are
discussed in Hausman (1974). But we know if the covariance restrictions
were not present, 3SLS (or here 2SLS) gives asymptotically efficient
instruments. Since 3SLS is a linear IV estimator, it is straightforward
to compute and is included in many econometric computer packages. Yet we
also know that if $\sigma_{12} = 0$ then 3SLS is not asymptotically
efficient. Furthermore, if $\beta_{21} \neq 0$ and $\gamma_{22} = 0$, it
would not be clear how to do 2SLS on the first equation, since it is not
identified by the coefficient restrictions alone. If we had to use the
2SLS instruments $\hat y_2\ (= Z\hat\Pi_2)$ and $z_1$, the instrument
matrix $(W_1'X_1)$ would be singular as expected since the rank condition
fails. The FIML solution which accounts for the covariance restriction
$\sigma_{12} = 0$ is very suggestive. Suppose for the first equation in
the specification of equations (2.1) and (2.2)
$\hat\varepsilon_2 = y_2 - X_2\hat\delta_2$ is used as an instrument in
addition to $z_1$ and $z_2$, as is the case for FIML. It follows
immediately that the optimal estimator which uses $z_1$, $z_2$, and
$\hat\varepsilon_2$ as instruments for the first equation and $z_1$ and
$z_2$ as instruments for the second equation, which we call augmented
3SLS or A3SLS, is asymptotically more efficient than ordinary 3SLS since
it uses more instruments. But the important question is whether A3SLS is
asymptotically equivalent to FIML, as it is without covariance
restrictions. We have accounted for all the restrictions in the model
by adding $\hat\varepsilon_2$ as an instrument for the first equation.
A3SLS differs from FIML because it replaces efficient estimates of
parameters in the instrument matrix with inefficient estimates. Unlike
the case of only coefficient restrictions, this replacement does affect
the asymptotic distribution of the coefficient estimator. However, this
replacement is corrected for by the optimal IV estimator in such a way
that A3SLS is asymptotically equivalent to FIML.
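The idea of adding the estimated residual from the second equation to the instrument list can be sketched in a few lines of Python. The example below is a simplified single equation illustration, not the full A3SLS estimator: it projects the regressors of the first equation on the instruments ($z_1$, $z_2$) and on the augmented set ($z_1$, $z_2$, $\hat\varepsilon_2$), and it does not apply the optimal system weighting discussed in the text. Function names and parameter values are assumptions made for the example.

```python
import numpy as np

def simulate(T, b12, b21, g11, g22, seed=0):
    """Simulate the two equation model with unit variances and sigma_12 = 0."""
    rng = np.random.default_rng(seed)
    z1, z2, e1, e2 = rng.normal(size=(4, T))
    d = 1.0 - b12 * b21
    y1 = (g11 * z1 + b12 * g22 * z2 + e1 + b12 * e2) / d
    y2 = (b21 * g11 * z1 + g22 * z2 + b21 * e1 + e2) / d
    return y1, y2, z1, z2

def projected_iv(y, X, W):
    """IV estimator that uses the projection of X on the instrument set W."""
    X_hat = W @ np.linalg.solve(W.T @ W, W.T @ X)
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

def first_equation_estimates(y1, y2, z1, z2):
    """Return (2SLS estimate, augmented IV estimate) of (beta_12, gamma_11)."""
    Z = np.column_stack([z1, z2])
    X1 = np.column_stack([y2, z1])
    X2 = np.column_stack([y1, z2])
    delta2_hat = np.linalg.solve(Z.T @ X2, Z.T @ y2)   # 2SLS on the second equation
    e2_hat = y2 - X2 @ delta2_hat                      # its estimated residual
    W_aug = np.column_stack([Z, e2_hat])               # z1, z2 and e2_hat
    return projected_iv(y1, X1, Z), projected_iv(y1, X1, W_aug)

if __name__ == "__main__":
    y1, y2, z1, z2 = simulate(100_000, 0.5, -0.8, 1.0, 1.0)
    two_sls, augmented = first_equation_estimates(y1, y2, z1, z2)
    print("2SLS:        ", two_sls)
    print("augmented IV:", augmented)
```

Both estimators are consistent under $\sigma_{12} = 0$. Comparing their sampling variability, or implementing the optimally weighted system estimator, would require the additional steps developed in the remainder of the section.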

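To investigate the asymptotic properties of the estimators in this
case, we first calculate the information matrix for the example of
equations (2.1) and (2.2). We denote $d = (1 - \beta_{12}\beta_{21})$, let
$\delta = (\beta_{12}, \gamma_{11}, \beta_{21}, \gamma_{22})'$, and
$\sigma = (\sigma_{11}, \sigma_{22})'$. Let $M$ denote $\lim (1/T)Z'Z$,
$e_1 = (1,0)'$, $e_2 = (0,1)'$, and $D_i = (\Pi_j, e_i)$, $i \neq j$. The
lower triangle of the information matrix is

(2.10)  $J = \begin{bmatrix} J_{11} & J_{12} \\ J_{21} & J_{22} \end{bmatrix} = -\lim \frac{1}{T} E \begin{bmatrix} \dfrac{\partial^2 L}{\partial\delta\,\partial\delta'} & \\ \dfrac{\partial^2 L}{\partial\sigma\,\partial\delta'} & \dfrac{\partial^2 L}{\partial\sigma\,\partial\sigma'} \end{bmatrix}$

where

$J_{11} = \begin{bmatrix} \dfrac{2\beta_{21}^2 + \sigma_{22}/\sigma_{11}}{d^2}\, e_1 e_1' + D_1'MD_1/\sigma_{11} & \\ \dfrac{1}{d^2}\, e_1 e_1' & \dfrac{2\beta_{12}^2 + \sigma_{11}/\sigma_{22}}{d^2}\, e_1 e_1' + D_2'MD_2/\sigma_{22} \end{bmatrix}$,

$J_{21} = \begin{bmatrix} \dfrac{\beta_{21}}{\sigma_{11} d}\, e_1' & 0 \\ 0 & \dfrac{\beta_{12}}{\sigma_{22} d}\, e_1' \end{bmatrix}$,

and

$J_{22} = \begin{bmatrix} 1/(2\sigma_{11}^2) & \\ 0 & 1/(2\sigma_{22}^2) \end{bmatrix}$.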


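We now compute the covariance matrix for $\hat\delta$ using the formula
$J^{11} = (J_{11} - J_{12}J_{22}^{-1}J_{21})^{-1}$ to find

(2.11)  $Var(\hat\delta) = J^{11} = \begin{bmatrix} \dfrac{\sigma_{22}}{\sigma_{11} d^2}\, e_1 e_1' + D_1'MD_1/\sigma_{11} & \dfrac{1}{d^2}\, e_1 e_1' \\ \dfrac{1}{d^2}\, e_1 e_1' & \dfrac{\sigma_{11}}{\sigma_{22} d^2}\, e_1 e_1' + D_2'MD_2/\sigma_{22} \end{bmatrix}^{-1}$.

It is interesting to note that without covariance restrictions we
would have

(2.12)  $Var(\hat\delta) = \begin{bmatrix} D_1'MD_1/\sigma_{11} & 0 \\ 0 & D_2'MD_2/\sigma_{22} \end{bmatrix}^{-1}$

so that the covariance restrictions have produced a more efficient
estimator via the addition of the positive semi-definite matrix
$(1/d^2)F'F$ to $(J^{11})^{-1}$, where
$F = [(\sigma_{22}/\sigma_{11})^{1/2} e_1',\ (\sigma_{11}/\sigma_{22})^{1/2} e_1']$.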
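The partitioned inverse step can be checked numerically. The Python sketch below builds the blocks of the information matrix for the two equation model as they follow from differentiating the log likelihood (2.3), namely the $\delta\delta$ block with diagonal terms $[(2\beta_{21}^2 + \sigma_{22}/\sigma_{11})/d^2]e_1e_1' + D_1'MD_1/\sigma_{11}$ and $[(2\beta_{12}^2 + \sigma_{11}/\sigma_{22})/d^2]e_1e_1' + D_2'MD_2/\sigma_{22}$ and off-diagonal term $e_1e_1'/d^2$, the $\sigma\delta$ block with rows $(\beta_{21}/(\sigma_{11}d))e_1'$ and $(\beta_{12}/(\sigma_{22}d))e_1'$, and the $\sigma\sigma$ block $\mathrm{diag}(1/(2\sigma_{11}^2), 1/(2\sigma_{22}^2))$. It then confirms that $J_{11} - J_{12}J_{22}^{-1}J_{21}$ equals the unrestricted block diagonal matrix plus $(1/d^2)F'F$. Function names and the numerical values are assumptions made for illustration.

```python
import numpy as np

def model_pieces(b12, b21, g11, g22):
    """Return d, e1 and D1 = (Pi_2, e1), D2 = (Pi_1, e2)."""
    d = 1.0 - b12 * b21
    e1 = np.array([[1.0], [0.0]])
    e2 = np.array([[0.0], [1.0]])
    Pi1 = np.array([[g11], [b12 * g22]]) / d       # reduced form of y1
    Pi2 = np.array([[b21 * g11], [g22]]) / d       # reduced form of y2
    return d, e1, np.hstack([Pi2, e1]), np.hstack([Pi1, e2])

def restricted_information(b12, b21, g11, g22, s11, s22, M):
    """Schur complement J11 - J12 J22^{-1} J21 of the information matrix."""
    d, e1, D1, D2 = model_pieces(b12, b21, g11, g22)
    E = e1 @ e1.T
    zero = np.zeros((1, 2))
    J11 = np.block([
        [(2 * b21**2 + s22 / s11) / d**2 * E + D1.T @ M @ D1 / s11, E / d**2],
        [E / d**2, (2 * b12**2 + s11 / s22) / d**2 * E + D2.T @ M @ D2 / s22],
    ])
    J21 = np.block([[b21 / (s11 * d) * e1.T, zero],
                    [zero, b12 / (s22 * d) * e1.T]])
    J22 = np.diag([1.0 / (2 * s11**2), 1.0 / (2 * s22**2)])
    return J11 - J21.T @ np.linalg.solve(J22, J21)

def unrestricted_information(b12, b21, g11, g22, s11, s22, M):
    """Inverse covariance matrix when the covariance restriction is ignored."""
    d, e1, D1, D2 = model_pieces(b12, b21, g11, g22)
    O = np.zeros((2, 2))
    return np.block([[D1.T @ M @ D1 / s11, O], [O, D2.T @ M @ D2 / s22]])

def gain_matrix(b12, b21, s11, s22):
    """The positive semi-definite matrix (1/d^2) F'F."""
    d = 1.0 - b12 * b21
    F = np.array([[np.sqrt(s22 / s11), 0.0, np.sqrt(s11 / s22), 0.0]])
    return F.T @ F / d**2

if __name__ == "__main__":
    params = (0.5, -0.8, 1.0, 1.0, 1.0, 2.0)       # b12, b21, g11, g22, s11, s22
    M = np.array([[1.0, 0.3], [0.3, 1.0]])
    gap = restricted_information(*params, M) - unrestricted_information(*params, M)
    print(np.round(gap - gain_matrix(0.5, -0.8, 1.0, 2.0), 12))
```

The terms $2\beta_{21}^2/d^2$ and $2\beta_{12}^2/d^2$ in $J_{11}$ are exactly cancelled by $J_{12}J_{22}^{-1}J_{21}$, which leaves the rank one gain $(1/d^2)F'F$.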

We now compare these results to the A3SLS estimator, which we will
denote by $\tilde\delta$. Stack equations (2.1) and (2.2) as

(2.13)  $y = X\delta + \varepsilon, \quad y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix}, \quad X = \begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}$.

Let the 2T x 5 matrix of instrumental variables for the system given in
equation (2.13) be W, where

$W = \begin{bmatrix} Z & \hat\varepsilon_2 & 0 \\ 0 & 0 & Z \end{bmatrix}, \quad \hat\varepsilon_2 = y_2 - X_2\hat\delta_2, \quad \hat\delta_2 = (Z'X_2)^{-1}Z'y_2$.

The A3SLS estimator is an instrumental variables estimator which
satisfies

(2.14)  $\tilde\delta = (A_T'W'X)^{-1}A_T'W'y$

where the 5 x 4 linear combination matrix $A_T$ is optimally chosen. Any
sequence $\{A_T\}$ which satisfies plim $A_T = A$, where

(2.15)  $A = V^{-1}\,\mathrm{plim}(W'X/T)$

and $V$ is the asymptotic covariance matrix of $W'\varepsilon/\sqrt{T}$, is
optimal in the sense of obtaining the linear combination of instrumental
variables with the smallest asymptotic covariance matrix for
$\tilde\delta$. The asymptotic covariance matrix of $\tilde\delta$ will
then be

(2.16)  $Var(\tilde\delta) = [\mathrm{plim}(X'W/T)\,V^{-1}\,\mathrm{plim}(W'X/T)]^{-1}$.

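Some care must be exercised when applying a central limit theorem to
calculate $V$, due to the instrument $\hat\varepsilon_2$ depending on an
estimated parameter $\hat\delta_2$. Let $\bar W$ be the instrument matrix
obtained from $W$ by replacing $\hat\varepsilon_2$ by the true disturbance
$\varepsilon_2$. Then by
$E(\varepsilon_1^2\varepsilon_2) = E(\varepsilon_1\varepsilon_2^2) = 0$
and $E(\varepsilon_1^2\varepsilon_2^2) = \sigma_{11}\sigma_{22}$ we can use
a central limit theorem to obtain

(2.17)  $\bar W'\varepsilon/\sqrt{T} = [Z'\varepsilon_1,\ \varepsilon_2'\varepsilon_1,\ Z'\varepsilon_2]'/\sqrt{T} \xrightarrow{d} N(0, \bar V)$

where

$\bar V = \begin{bmatrix} \sigma_{11}M & 0 & 0 \\ 0 & \sigma_{11}\sigma_{22} & 0 \\ 0 & 0 & \sigma_{22}M \end{bmatrix}$.

Also, since $\hat\varepsilon_2 = \varepsilon_2 - X_2(\hat\delta_2 - \delta_2) = \varepsilon_2 - X_2(Z'X_2)^{-1}Z'\varepsilon_2$
we can write

$W'\varepsilon/\sqrt{T} = \bar W'\varepsilon/\sqrt{T} - [0,\ \varepsilon_1'X_2(Z'X_2)^{-1}Z'\varepsilon_2/\sqrt{T},\ 0]' = P_T\,\bar W'\varepsilon/\sqrt{T}$,

where, for $I_2$ denoting a two dimensional identity matrix,

$P_T = \begin{bmatrix} I_2 & 0 & 0 \\ 0 & 1 & -\varepsilon_1'X_2(Z'X_2)^{-1} \\ 0 & 0 & I_2 \end{bmatrix}$.

Note that

plim $\varepsilon_1'X_2(Z'X_2)^{-1}$ = (plim $\varepsilon_1'X_2/T$)(plim $Z'X_2/T$)$^{-1}$ = $(\sigma_{11}e_1'/d)(MD_2)^{-1}$.

Then for $P$ = plim $P_T$, equation (2.17) implies

(2.18)  $W'\varepsilon/\sqrt{T} \xrightarrow{d} N(0, P\bar VP')$,

so that $V = P\bar VP'$. We also calculate that

(2.19)  plim $W'X/T$ = plim $\dfrac{1}{T}\begin{bmatrix} Z'X_1 & 0 \\ \hat\varepsilon_2'X_1 & 0 \\ 0 & Z'X_2 \end{bmatrix} = \begin{bmatrix} MD_1 & 0 \\ \sigma_{22}e_1'/d & 0 \\ 0 & MD_2 \end{bmatrix}$,

$P^{-1}\,\mathrm{plim}(W'X/T) = \begin{bmatrix} MD_1 & 0 \\ \sigma_{22}e_1'/d & \sigma_{11}e_1'/d \\ 0 & MD_2 \end{bmatrix}$.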


From equations (2.16), (2.18), and (2.19) we can obtain the asymptotic
covariance matrix for the A3SLS estimator

(2.20)  $Var(\tilde\delta) = [\mathrm{plim}(X'W/T)(P')^{-1}\bar V^{-1}P^{-1}\,\mathrm{plim}(W'X/T)]^{-1} = \begin{bmatrix} \dfrac{\sigma_{22}}{\sigma_{11} d^2}\, e_1 e_1' + D_1'MD_1/\sigma_{11} & \dfrac{1}{d^2}\, e_1 e_1' \\ \dfrac{1}{d^2}\, e_1 e_1' & \dfrac{\sigma_{11}}{\sigma_{22} d^2}\, e_1 e_1' + D_2'MD_2/\sigma_{22} \end{bmatrix}^{-1}$

which is also the FIML asymptotic covariance matrix obtained as equation
(2.11). Consequently, A3SLS is asymptotically efficient.
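The equality of the A3SLS and FIML covariance matrices can be confirmed numerically for particular parameter values. The Python sketch below assembles $P$ (the limit of the correction for the estimated residual, with middle row $[0, 1, -(\sigma_{11}e_1'/d)(MD_2)^{-1}]$), the covariance matrix of the instruments formed with the true $\varepsilon_2$, $\mathrm{diag}(\sigma_{11}M, \sigma_{11}\sigma_{22}, \sigma_{22}M)$, and plim $W'X/T$, and compares the resulting inverse covariance matrix with the FIML expression. Function names and the numerical inputs are assumptions made for this illustration.

```python
import numpy as np

def model_pieces(b12, b21, g11, g22):
    """Return d, e1 and D1 = (Pi_2, e1), D2 = (Pi_1, e2)."""
    d = 1.0 - b12 * b21
    e1 = np.array([[1.0], [0.0]])
    e2 = np.array([[0.0], [1.0]])
    Pi1 = np.array([[g11], [b12 * g22]]) / d
    Pi2 = np.array([[b21 * g11], [g22]]) / d
    return d, e1, np.hstack([Pi2, e1]), np.hstack([Pi1, e2])

def fiml_information(b12, b21, g11, g22, s11, s22, M):
    """Inverse of the FIML asymptotic covariance matrix under sigma_12 = 0."""
    d, e1, D1, D2 = model_pieces(b12, b21, g11, g22)
    E = e1 @ e1.T
    return np.block([
        [s22 / (s11 * d**2) * E + D1.T @ M @ D1 / s11, E / d**2],
        [E / d**2, s11 / (s22 * d**2) * E + D2.T @ M @ D2 / s22],
    ])

def a3sls_information(b12, b21, g11, g22, s11, s22, M):
    """Inverse of the A3SLS asymptotic covariance matrix, Q' V^{-1} Q."""
    d, e1, D1, D2 = model_pieces(b12, b21, g11, g22)
    I2, O22 = np.eye(2), np.zeros((2, 2))
    o21, o12, one = np.zeros((2, 1)), np.zeros((1, 2)), np.ones((1, 1))
    # P: limit of the correction for using the estimated residual e2_hat.
    P = np.block([[I2, o21, O22],
                  [o12, one, -(s11 / d) * e1.T @ np.linalg.inv(M @ D2)],
                  [O22, o21, I2]])
    # V_bar: covariance of [Z'e1, e2'e1, Z'e2] / sqrt(T) with the true e2.
    V_bar = np.block([[s11 * M, o21, O22],
                      [o12, s11 * s22 * one, o12],
                      [O22, o21, s22 * M]])
    # Q: plim W'X/T for W = [[Z, e2_hat, 0], [0, 0, Z]].
    Q = np.block([[M @ D1, O22],
                  [(s22 / d) * e1.T, o12],
                  [O22, M @ D2]])
    V = P @ V_bar @ P.T
    return Q.T @ np.linalg.solve(V, Q)

if __name__ == "__main__":
    params = (0.5, -0.8, 1.0, 1.0, 1.0, 2.0)       # b12, b21, g11, g22, s11, s22
    M = np.array([[1.0, 0.3], [0.3, 1.0]])
    print(np.round(a3sls_information(*params, M) - fiml_information(*params, M), 12))
```

The correction matrix $P$ matters: replacing it with the identity, that is, treating $\hat\varepsilon_2$ as if it were the true disturbance when forming $V$, would in general give a different matrix.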


It is clear that A3SLS must be more efficient, asymptotically, than
3SLS because 3SLS ignores the instrumental variable $\hat\varepsilon_2$
when forming the instruments. It is somewhat surprising that A3SLS is
efficient, since A3SLS uses an inefficient estimator when forming
$\hat\varepsilon_2$. A3SLS corrects for the use of $\hat\delta_2$ by using
a different linear combination of the instruments than FIML. Rather than
using $y_i^* = Z\Pi_i + \varepsilon_i/(1-\beta_{12}\beta_{21})$ as an
instrument for $y_i$, i = 1,2, A3SLS is a system IV estimator, with linear
combination matrix

(2.21)  $A = (P')^{-1}\bar V^{-1}P^{-1}\,\mathrm{plim}(W'X/T)$.

The reason that this correction for the use of the inefficient
estimator $\hat\delta_2$ in forming $\hat\varepsilon_2$ results in fully
efficient estimates is that the FIML estimate $\hat\delta$ is also an IV
estimator of the form given in equation (2.14). To see why this is so,
note that asymptotically $W'\varepsilon/\sqrt{T}$ is a nonsingular linear
transformation of $\bar W'\varepsilon/\sqrt{T}$. If two sets of
instrumental variables differ only by a nonsingular linear
transformation, they span the same column space and therefore lead to
equivalent estimators when used in an optimal manner.

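Similarly, we can show that for the FIML instruments $W^*$, it is the
case that

(2.22)  $W^{*\prime}\varepsilon/\sqrt{T} = S(\bar W'\varepsilon/\sqrt{T}) + o_p(1), \quad S = \begin{bmatrix} I_2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & I_2 \\ 0 & 1 & 0 \end{bmatrix}$.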


a^^e^  Yd 


0^2  e^'   /d  0 


,11 


0    e^   (/(J22<=')  ^2'  /''22 


From equation (2.22) it follows that

(2.23)  $W^{*\prime}\varepsilon/\sqrt{T} = SP^{-1}\,W'\varepsilon/\sqrt{T} + o_p(1)$.

We can also compute

(2.24)  plim $W^{*\prime}X/T = SP^{-1}$ plim $W'X/T$.

Since the FIML estimator $\hat\delta$ solves $B_T'W^{*\prime}\varepsilon = 0$,
where plim $B_T = B$ and

$B = \begin{bmatrix} D_1 & 0 \\ e_1'/d & 0 \\ 0 & D_2 \\ 0 & e_1'/d \end{bmatrix}$,

it follows from equations (2.23) and (2.24) that

(2.25)  $\sqrt{T}(\hat\delta - \delta) = (B'W^{*\prime}X/T)^{-1}B'W^{*\prime}\varepsilon/\sqrt{T} + o_p(1) = (B'SP^{-1}W'X/T)^{-1}B'SP^{-1}W'\varepsilon/\sqrt{T} + o_p(1)$

so that FIML is an instrumental variables estimator with instruments W.


3.  FIML Estimation in the M-Equation Case

We now turn to the general case where zero restrictions are present
on some elements of the covariance matrix, but the covariance matrix is
not necessarily assumed to be diagonal.⁴ We consider the standard linear
simultaneous equations model where all identities are assumed to have
been substituted out of the system of equations:

(3.1)  $YB + Z\Gamma = U$

where Y is the T x M matrix of jointly endogenous variables, Z is the T x K
matrix of predetermined variables, and U is the T x M matrix of the
structural disturbances of the system. The model has M equations and T
observations. It is assumed that B is nonsingular and that Z is of full
rank. We assume that plim (1/T)(Z'U) = 0, and that the second order
moment matrices of the current predetermined and endogenous variables
have nonsingular probability limits. Lastly, if lagged endogenous
variables are included as predetermined variables, the system is assumed
to be stable.

The structural disturbances are assumed to be mutually independent
and identically distributed as a nonsingular M-variate normal
distribution:

(3.2)  $U \sim N(0, \Sigma \otimes I_T)$

where $\Sigma$ is positive definite.⁵ However, we allow for restrictions
on the elements of $\Sigma$ of the form $\sigma_{ij} = 0$ for $i \neq j$,
which distinguishes this from the case that Hausman (1975) examined. In
deriving the first order conditions from the likelihood function, we will
only solve for the unknown elements of $\Sigma$ rather than the complete
matrix as is the usual case. Using the results of Hausman and Taylor
(1982) and Section 5, we assume that each equation in the model is
identified by use of coefficient restrictions on the elements of B and
$\Gamma$ and covariance restrictions on elements of $\Sigma$.

4. This set-up is fairly general since all linear restrictions on the
covariance matrix can be put into this form by appropriate
transformations of the model specification. However, an important case
which our approach does not treat occurs when unknown slope parameters are
present in the covariance matrix. This sometimes will occur when errors
in variables are present in a simultaneous equations model,
e.g., Hausman (1977).

We will make use of the reduced form specification,

(3.3)  $Y = -Z\Gamma B^{-1} + UB^{-1} = Z\Pi + V$.

As we saw in the last section, it is the components of the (row) vector
$v_i = (UB^{-1})_i$ that give the extra instruments that arise because of
the covariance restrictions. The other form of the original system of
equations which will be useful is the so-called "stacked" form. We use
the normalization rule $B_{ii} = 1$ for all i and then rewrite each
equation in regression form where only unknown parameters appear on the
right-hand side:

(3.4)  $y_i = X_i\delta_i + u_i, \quad X_i = [Y_i\ Z_i], \quad \delta_i' = [\beta_i'\ \gamma_i']$.

5. If U is not normal but has the same first two moments as in equation
(3.2), the FIML estimator will be consistent and asymptotically normal.
However, unlike the case of no covariance restrictions, standard errors
which are calculated using the normal disturbance information matrix may
be inconsistent, due to third and fourth order moments of the non-normal
disturbances. For this same reason, FIML assuming normal disturbances
may not lead to the optimal IV estimator when the disturbances are
non-normal.

It will prove convenient to stack these M equations into a system

(3.5)  $y = X\delta + u$

where

$X = \begin{bmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & X_M \end{bmatrix}, \quad \delta = \begin{bmatrix} \delta_1 \\ \vdots \\ \delta_M \end{bmatrix}, \quad u = \begin{bmatrix} u_1 \\ \vdots \\ u_M \end{bmatrix}$.

Likewise, we stack the reduced form equations

(3.6)  $y = \bar Z\Pi + v$

where $\bar Z = [I_M \otimes Z]$ and $\Pi = [\Pi_1' \ldots \Pi_M']'$ is the
vector of reduced form coefficients.
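The relation between the structural form, the reduced form, and the stacked reduced form can be sketched in Python as follows. The helper names (`reduced_form`, `stack_columns`, `stacked_instruments`) and the small numerical example are assumptions made for illustration; the only facts used are $\Pi = -\Gamma B^{-1}$, $V = UB^{-1}$, and the identity that stacking the columns of $Z\Pi$ gives $(I_M \otimes Z)$ times the stacked columns of $\Pi$.

```python
import numpy as np

def reduced_form(B, Gamma):
    """Reduced form coefficients Pi = -Gamma B^{-1} for Y B + Z Gamma = U."""
    return -Gamma @ np.linalg.inv(B)

def stack_columns(A):
    """Stack the columns of a matrix into one long vector, equation by equation."""
    return A.reshape(-1, order="F")

def stacked_instruments(Z, n_equations):
    """Block diagonal matrix I_M (Kronecker product) Z with one copy of Z per equation."""
    return np.kron(np.eye(n_equations), Z)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, K = 6, 3
    B = np.array([[1.0, 0.8], [-0.5, 1.0]])      # normalized so that B_ii = 1
    Gamma = rng.normal(size=(K, 2))
    Z = rng.normal(size=(T, K))
    U = rng.normal(size=(T, 2))
    Pi = reduced_form(B, Gamma)
    V = U @ np.linalg.inv(B)                     # reduced form disturbances
    Y = Z @ Pi + V
    y = stack_columns(Y)
    fitted = stacked_instruments(Z, 2) @ stack_columns(Pi) + stack_columns(V)
    print(np.allclose(y, fitted), np.allclose(Y @ B + Z @ Gamma, U))
```

Column stacking keeps the observations of each equation together, which matches the block diagonal layout of the stacked regressor matrix.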

The log likelihood function arises from the model specification in
equation (3.1) and the distribution assumption of equation (3.2):

(3.7)  $L(B,\Gamma,\Sigma) = c + \frac{T}{2}\log\det(\Sigma)^{-1} + T\log|\det(B)| - \frac{T}{2}\,\mathrm{tr}\left[\frac{1}{T}\Sigma^{-1}(YB + Z\Gamma)'(YB + Z\Gamma)\right]$


23 

where  the  constant  c  is  disregarded  in  maximization  procedures.  ¥e  now 
calculate  the  first  order  necessary  conditions  for  a  maximum  by  matrix 
differentiation.  The  procedures  used  and  the  conditions  derived  are  the 
same  as  in  Hausman  (1975,  p«  750).  To  reduce  confusion,  we  emphasize  that 
we  only  differentiate  with  respect  to  unknown  parameters,  and  we  use  the 
symbol  =  to  remind  the  reader  of  this  fact.^  Thus  the  number  of  equations 
in  each  block  of  the  first  order  conditions  equals  the  number  of  unknown 

parameters;  e.g.,  the  number  of  equations  in  (5.8a)  below  equals  the 

2 

number  of  unknown  parameters  in  B  rather  than  M  .  The  first  order 


conditions  are 

"aB 


(3.8a)  -II  '   T(B')'^  -  Y'(YB  +  2r)z''^    =  0, 


(3.8b)  -W"'  ~    ^'^^^  ^   ^^^'^  ^  °' 


(3.8c)  -^  :  TC  -  (YB  +  a:')'(YB  +  ZP)  =  0. 

bz 
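As an informal check on the matrix differentiation, the sketch below evaluates the log likelihood (3.7) without the constant c and compares the analytic expression in (3.8a) with a central finite-difference gradient taken over every element of B, that is, ignoring normalization and exclusion restrictions. The data, the parameter values, and the function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def loglik(B, Gamma, Sigma, Y, Z):
    """Log likelihood (3.7) without the constant c."""
    T = Y.shape[0]
    U = Y @ B + Z @ Gamma
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logabsdet_B = np.linalg.slogdet(B)
    quad = np.trace(np.linalg.solve(Sigma, U.T @ U))   # tr[Sigma^{-1} U'U]
    return -0.5 * T * logdet_sigma + T * logabsdet_B - 0.5 * quad

def grad_B(B, Gamma, Sigma, Y, Z):
    """Left-hand side of (3.8a): T (B')^{-1} - Y'(YB + Z Gamma) Sigma^{-1}."""
    T = Y.shape[0]
    U = Y @ B + Z @ Gamma
    return T * np.linalg.inv(B).T - Y.T @ U @ np.linalg.inv(Sigma)

def numeric_grad_B(B, Gamma, Sigma, Y, Z, h=1e-6):
    """Central finite differences with respect to each element of B."""
    num = np.zeros_like(B)
    for i in range(B.shape[0]):
        for j in range(B.shape[1]):
            step = np.zeros_like(B)
            step[i, j] = h
            num[i, j] = (loglik(B + step, Gamma, Sigma, Y, Z)
                         - loglik(B - step, Gamma, Sigma, Y, Z)) / (2 * h)
    return num

# Hypothetical data and parameter values, chosen only so that B and Sigma are well conditioned.
rng = np.random.default_rng(0)
T, M, K = 30, 3, 2
Y = rng.normal(size=(T, M))
Z = rng.normal(size=(T, K))
Gamma = rng.normal(size=(K, M))
B = np.array([[1.0, 0.2, 0.0],
              [0.3, 1.0, 0.1],
              [0.0, 0.4, 1.0]])
Sigma = np.array([[1.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 1.0]])

analytic = grad_B(B, Gamma, Sigma, Y, Z)
numeric = numeric_grad_B(B, Gamma, Sigma, Y, Z)
```

In an actual application only the elements of the gradient that correspond to unknown parameters would be set to zero, which is the point of the $\overset{u}{=}$ notation.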

In particular, note that we cannot postmultiply equation (3.8b), or later the transformed versions of equation (3.8a), to eliminate $\Sigma^{-1}$, as a simple two equation example will easily convince the reader.

Let us consider the first order conditions in reverse order. We already know some elements of $\Sigma$ because of the covariance restrictions. The unknown elements are then estimated by $\hat\sigma_{ij} = (1/T)(y_i - X_i\hat\delta_i)'(y_j - X_j\hat\delta_j)$, where the $\hat\delta$'s contain the estimates of the unknown elements of the B and $\Gamma$ matrices. Equation (3.8b) causes no special problems because it has the form of the first order conditions of the multivariate regression model. But it is equation (3.8a) which we must transform to put the solution into instrumental variable form. The presence of the term $T(B')^{-1}$, which arises from the Jacobian term in the log likelihood function of equation (3.7), distinguishes the problem from the non-simultaneous equations case. We now transform equation (3.8a) to eliminate $T(B')^{-1}$ and to replace the matrix Y' by the appropriate matrix of predicted Y's which are orthogonal to the U matrix.

6. An alternative procedure is to use Lagrange multiplier relationships of the type $0 = \sigma_{ij} = (y_i - X_i\delta_i)'(y_j - X_j\delta_j)$ for known elements of $\Sigma$, but the approach adopted in the paper seems more straightforward.

First we transform equation (3.8a) by using the identity $\Sigma\Sigma^{-1} = I$:

(3.9)    $[T(B')^{-1}\Sigma - Y'(YB + Z\Gamma)]\Sigma^{-1} \overset{u}{=} 0.$


Note the presence in equation (3.9) of the term $(B')^{-1}\Sigma$ which is the key term for the identification results in Hausman and Taylor (1982). We now do the substitution similar to the one we used for equations (2.5) to (2.7) in the two equation case. From equation (3.8c), we know that the elements of $\Sigma$ take one of two forms. Either they equal the inner product of the residuals from the appropriate equations divided by T or they equal zero. To establish some notation, define the set $N_j$ as the indices m which denote for the j'th row of $\Sigma$ that $\sigma_{jm} = 0$. Now we return to the first part of equation (3.9) and consider the ij'th element of the matrix product

(3.10)    $[T(B')^{-1}\Sigma]_{ij} = T\sum_{k=1}^{M}\beta^{ki}\sigma_{kj} = \Big(v_i - \sum_{k\in N_j}\beta^{ki}u_k\Big)'u_j$

where $\beta^{ik}$ is the ik'th element of the inverse matrix $B^{-1}$. Note that if no zero elements existed in column j of $\Sigma$ we would have $v_i'u_j$ on the right hand side of (3.10), as in Hausman (1975, equation 11). We now use the expression from equation (3.10) and combine it with the other terms from the bracketed term in equation (3.9):

(3.11)    $[T(B')^{-1}\Sigma - Y'(YB + Z\Gamma)]_{ij} = \Big[v_i - \sum_{k\in N_j}\beta^{ki}u_k - (Z\Pi_i + v_i)\Big]'u_j$

$= -\Big[Z\Pi_i + \sum_{k\in N_j}\beta^{ki}u_k\Big]'u_j$

$= -[Z\Pi_i + \tilde v_{ij}]'u_j,$

where $\tilde v_{ij}$ corresponds to the sum of the elements $\beta^{ki}$ multiplied by the residuals $u_k$ in the set $N_j$.
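The identity in (3.10) is purely algebraic once the restricted elements of the covariance estimate are set to zero, so it can be confirmed numerically. The sketch below assumes a three equation system with the single restriction $\sigma_{12} = 0$; the matrix B and the disturbances are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
T, M = 40, 3
B = np.array([[1.0, 0.2, 0.0],
              [0.3, 1.0, 0.1],
              [0.0, 0.4, 1.0]])
U = rng.normal(size=(T, M))      # structural disturbances (hypothetical draw)
B_inv = np.linalg.inv(B)
V = U @ B_inv                    # reduced form disturbances V = U B^{-1}

# Zero covariance restriction sigma_12 = 0, stored symmetrically with 0-based indices.
restricted = {(0, 1), (1, 0)}

# Restricted covariance estimate: inner products of disturbances over T, or zero.
Sigma_hat = U.T @ U / T
for i, j in restricted:
    Sigma_hat[i, j] = 0.0

# Left-hand side of (3.10): T (B')^{-1} Sigma.
lhs = T * B_inv.T @ Sigma_hat

# Right-hand side of (3.10): (v_i - sum_{k in N_j} beta^{ki} u_k)' u_j.
rhs = np.empty((M, M))
for i in range(M):
    for j in range(M):
        N_j = [k for k in range(M) if (j, k) in restricted]
        adjusted = V[:, i] - sum(B_inv[k, i] * U[:, k] for k in N_j)
        rhs[i, j] = adjusted @ U[:, j]
```

The two matrices agree element by element, and for a column j with no restricted covariances the right-hand side collapses to $v_i'u_j$.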

As with equation (2.7) we see that FIML replaces the jointly-endogenous variable $y_i = Z\Pi_i + v_i$ with the prediction from the predetermined variables $Z\Pi_i$ and those elements of $v_i = (UB^{-1})_i$ which are uncorrelated with $u_j$ because of zero restrictions on the $\sigma_{ij}$'s. Thus we rewrite equation (3.9) as

(3.12)    $-[(-Z\Gamma B^{-1} + \tilde V)'(YB + Z\Gamma)]\Sigma^{-1} = -[(\hat Y + \tilde V)'(YB + Z\Gamma)]\Sigma^{-1} \overset{u}{=} 0.$


Equation (3.12) demonstrates the essential difference for FIML estimation which arises between the case of no covariance constraints and the present situation. We see that in addition to the usual term $\hat Y$, we have the extra elements $\tilde V$ which are uncorrelated structural residuals multiplied by the appropriate elements of $B^{-1}$. Thus FIML uses the covariance restrictions to form a better predictor of Y than the usual $\hat Y$.

Note that if $((B')^{-1}\Sigma)_{ij} = 0$ in equation (3.10), equations i and j are relatively recursive. Then $y_i$ is predetermined in the j'th equation rather than jointly endogenous,^7 and equation (3.11) reduces to the same form as equation (3.8b): columns of Y are treated like columns of Z. In general, however, $y_i$ is replaced by a predicted value which is composed of two terms: a prediction of the mean from the predetermined variables and an estimate of part of the reduced form disturbance from the uncorrelated structural residuals. For future reference, we gather together our transformed first order conditions which arise from equations (3.8a) - (3.8c):

(3.13)    $\begin{bmatrix} -(B^{-1})'\Gamma'Z' + \tilde V' \\ Z' \end{bmatrix}(YB + Z\Gamma)\Sigma^{-1} \overset{u}{=} 0,$

$T\Sigma - (YB + Z\Gamma)'(YB + Z\Gamma) \overset{u}{=} 0.$

We now calculate the asymptotic Cramer-Rao bound for the estimator. Under our assumptions, we have a linear structural model for an i.n.i.d. specification. We do not verify regularity conditions here since they have been given for this model before, e.g., Hood and Koopmans (1953) or Rothenberg (1973).^8 Let

7. See Hausman and Taylor (1983).

8. The most straightforward approach to regularity conditions is to use the reduced form. The reduced form has a classical multivariate least squares specification subject to nonlinear parameter restrictions. Since the likelihood function is identical for either the structural or reduced form specification, the more convenient form can be used for the specific problem being considered.


$\tilde B = \mathrm{diag}(\tilde B_1, \tilde B_2, \ldots, \tilde B_M)$,    $\tilde B_i = [((B')^{-1})_i' \;\; 0_i']'$,

where $0_i$ is an $s_i \times M$ null matrix which corresponds to the $s_i$ included predetermined variables and $((B')^{-1})_i$ is the matrix of rows of $(B')^{-1}$ which correspond to the $r_i$ included explanatory endogenous variables. B is the matrix B with normalization and exclusion restrictions imposed. Let E be the $M^2 \times M^2$ matrix whose ij'th block (i,j = 1,...,M) is given by $E_{ij}$, a matrix with one in (j,i) and zeros elsewhere. With no restrictions on the disturbance covariance matrix, the information matrix for the unknown parameters is given by

(3.14)    $J(\delta,\sigma^*) = \begin{bmatrix} \tilde B E \tilde B' + \mathrm{plim}\,\frac{1}{T}X'(\Sigma^{-1}\otimes I_T)X & \tilde B(\Sigma^{-1}\otimes I_M)R' \\ R(\Sigma^{-1}\otimes I_M)\tilde B' & \frac{1}{2}R(\Sigma^{-1}\otimes\Sigma^{-1})R' \end{bmatrix}$

where R is the $M^2 \times \frac{1}{2}M(M+1)$ matrix of ones and zeros that maps $\sigma^* = (\sigma_{11},\ldots,\sigma_{M1},\sigma_{22},\ldots,\sigma_{M2},\ldots,\sigma_{MM})'$ into the full vector of $\sigma_{ij}$'s which ignores the symmetry restrictions: see Richard (1975). If L covariances are restricted to be zero, let S be the $\frac{1}{2}M(M+1) \times [\frac{1}{2}M(M+1) - L]$ selection matrix which selects the non-zero elements of $\sigma^*$. The information matrix with covariance restrictions is then identical to that in equation (3.14) with S'R substituted for R: see Appendix A, equation (A.15).

The inverse of the corresponding Cramer-Rao bound for the slope coefficients is given by

(3.15)    $(J^{\delta\delta})^{-1} = \mathrm{plim}\,\frac{1}{T}D'(I\otimes Z')(\Sigma^{-1}\otimes I_T)(I\otimes Z)D$

$+ \; 2\tilde B(\Sigma^{-1}\otimes I_M)R'\left[(R(\Sigma^{-1}\otimes\Sigma^{-1})R')^{-1} - S(S'R(\Sigma^{-1}\otimes\Sigma^{-1})R'S)^{-1}S'\right]R(\Sigma^{-1}\otimes I_M)\tilde B'$

where $D = \mathrm{diag}(D_1,\ldots,D_M)$, and $D_i = [\Pi_i, I_i]$, where $I_i$ is a selection matrix which chooses the explanatory variables for the i'th structural equation: $ZI_i = Z_i$: see Appendix A, equation (A.14). The first term in equation (3.15) is the inverse of the covariance matrix for the 3SLS estimates of the slope coefficients. Since the second term can be shown to be positive semi-definite, 3SLS is asymptotically inefficient relative to FIML in the presence of covariance restrictions.

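For a diagonal covariance matrix, these expressions simplify further. If $E^{*\prime}$ is the $M \times M^2$ matrix given by $[E_{11}, E_{22}, \ldots, E_{MM}]$ then the information matrix reduces to

$J(\delta,\sigma_{11},\ldots,\sigma_{MM}) = \begin{bmatrix} \tilde B E \tilde B' + \mathrm{plim}\,\frac{1}{T}X'(\Sigma^{-1}\otimes I_T)X & \tilde B E^*\Sigma^{-1} \\ \Sigma^{-1}E^{*\prime}\tilde B' & \frac{1}{2}\Sigma^{-1}\Sigma^{-1} \end{bmatrix}.$

Similarly, the inverse of the Cramer-Rao bound for the slope coefficients is given by

$(J^{\delta\delta})^{-1} = \tilde B(E + \Sigma^{-1}\otimes\Sigma - 2E^*E^{*\prime})\tilde B' + \mathrm{plim}\,\frac{1}{T}D'(\Sigma^{-1}\otimes Z'Z)D.$

We now wish to compare these results with the limiting covariance matrix of our A3SLS instrumental variables estimator.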
4. Instrumental Variables in the M-Equation Case

To analyze the limiting distribution of the A3SLS estimator in the general case, it will be helpful to rewrite the FIML estimator from equations (3.13) in an instrumental variables form, following the derivation in Hausman (1975) for the unrestricted $\Sigma$ case. Since we have one equation in equation (3.13) for each unknown $\delta$, we take each equation from the first block of the gradient (3.13) corresponding to the system of equations (3.1), impose the normalization $\beta_{ii} = 1$ and the exclusion restrictions on the slope parameters, and stack the results in the form of equation (3.5). We then have

(4.1)    $\tilde X'(\hat\Sigma\otimes I)^{-1}(y - X\hat\delta) = 0$

so

(4.2)    $\hat\delta = (\tilde X'(\hat\Sigma\otimes I)^{-1}X)^{-1}\tilde X'(\hat\Sigma\otimes I)^{-1}y = (W_F'X)^{-1}W_F'y$

where $\tilde X = \mathrm{diag}(\tilde X_1, \tilde X_2, \ldots, \tilde X_M)$, $\tilde X_i = [-Z(\hat\Gamma\hat B^{-1})_i + (\tilde V)_i, \; Z_i]$, and $\hat\Sigma$ has its known elements set to zero and its unknown elements calculated by $\hat\sigma_{ij} = (1/T)(y_i - X_i\hat\delta_i)'(y_j - X_j\hat\delta_j)$. The instruments are given by

(4.3)    $W_F' = \tilde X'(\hat\Sigma\otimes I)^{-1}.$
Since the instruments depend on elements of $\hat\delta$, the resulting problem is nonlinear and needs to be solved by iterative methods. While we could iterate on equation (4.2) and see if convergence occurred, better methods exist: see Hausman (1974).

The ordinary 3SLS estimator

(4.4)    $\hat\delta_3 = (X'[\Sigma^{-1}\otimes P_Z]X)^{-1}X'[\Sigma^{-1}\otimes P_Z]y$

can be written in the form of equation (4.2), where the instruments are

(4.5)    $W_3' = X'[I\otimes P_Z][\Sigma\otimes I]^{-1}$

and $P_Z = Z(Z'Z)^{-1}Z'$. Compare equations (4.3) and (4.5) and note that FIML and 3SLS differ in their prediction of the explanatory variables ($Y_i$) in forming their instruments. Ordinary 3SLS projects X onto the space spanned by the exogenous variables Z, whereas FIML replaces $Y_i$ in equation j by $-Z(\Gamma B^{-1})_i + \tilde v_{ij}$.
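To make the instrumental variables reading of 3SLS concrete, the sketch below computes the estimator once from the direct formula (4.4) and once as $(W_3'X)^{-1}W_3'y$ with the instruments of (4.5), using 2SLS residuals to estimate $\Sigma$. The two equation data generating process, its coefficient values, and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
T, K = 60, 3
Z = rng.normal(size=(T, K))
U = rng.normal(size=(T, 2)) @ np.array([[1.0, 0.4], [0.0, 0.9]])

# Hypothetical two equation system used only to generate data:
#   y1 =  0.5*y2 + 1.0*z1          + u1
#   y2 = -0.3*y1 + 1.0*z2 + 0.5*z3 + u2
A = np.array([[1.0, -0.5], [0.3, 1.0]])
rhs = np.column_stack([Z[:, 0] + U[:, 0], Z[:, 1] + 0.5 * Z[:, 2] + U[:, 1]])
Y = np.linalg.solve(A, rhs.T).T

# Regression form y_i = X_i delta_i + u_i and the stacked system y = X delta + u.
X1 = np.column_stack([Y[:, 1], Z[:, 0]])
X2 = np.column_stack([Y[:, 0], Z[:, 1], Z[:, 2]])
y = np.concatenate([Y[:, 0], Y[:, 1]])
X = np.zeros((2 * T, 5))
X[:T, :2] = X1
X[T:, 2:] = X2

P_Z = Z @ np.linalg.solve(Z.T @ Z, Z.T)      # projection onto the columns of Z

def tsls(Xi, yi):
    """Single equation 2SLS, used here only to obtain residuals."""
    return np.linalg.solve(Xi.T @ P_Z @ Xi, Xi.T @ P_Z @ yi)

d1, d2 = tsls(X1, Y[:, 0]), tsls(X2, Y[:, 1])
U_hat = np.column_stack([Y[:, 0] - X1 @ d1, Y[:, 1] - X2 @ d2])
Sigma_hat = U_hat.T @ U_hat / T

# Direct 3SLS formula (4.4).
weight = np.kron(np.linalg.inv(Sigma_hat), P_Z)
delta_3sls = np.linalg.solve(X.T @ weight @ X, X.T @ weight @ y)

# Instrumental variables form with W_3' = X'[I (x) P_Z][Sigma (x) I]^{-1} from (4.5).
W3_t = X.T @ np.kron(np.eye(2), P_Z) @ np.linalg.inv(np.kron(Sigma_hat, np.eye(T)))
delta_iv = np.linalg.solve(W3_t @ X, W3_t @ y)
```

The agreement of `delta_3sls` and `delta_iv` rests on the Kronecker identity $[I\otimes P_Z][\Sigma\otimes I]^{-1} = \Sigma^{-1}\otimes P_Z$.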

The  system  A3SLS  estimator  differs  from  the  ordinary  3SLS  estimator 
in  the  set  of  variables  which  it  takes  to  be  uncorrelated  with  the 
structural  disturbance  in  each  equation.  For  3SLS,  these  are  simply  the 
exogenous  variables  Z  and  they  are  the  same  for  all  M  structural 
equations.  For  A3SLS,  these  instruments  are  augmented  for  each  equation 
by  the  residuals  which  are  assumed  uncorrelated  with  the  given 
disturbance  by  the  covariance  restrictions.  The  set  of  predetermined 
variables thus differs across structural equations for the A3SLS
estimator. 

Residuals can be added to the list of instruments by allowing $\hat W = [\bar Z, (I_M\otimes\hat U)S]$, with $\bar Z = I_M\otimes Z$, to be the matrix of instrumental variables, where $\hat U$ is a T × M matrix of estimated residuals and S is a selection matrix. For example, in the two equation example of Section 2, in order to use $\hat u_2$ as an instrument for equation (2.1), let S = (0,1,0,0)' so that

$\hat W = \begin{bmatrix} Z & 0 & \hat u_2 \\ 0 & Z & 0 \end{bmatrix}.$

We will assume that S is an $M^2 \times L$ matrix, where each column corresponds to a distinct covariance restriction $\sigma_{ij} = 0$ for some $i \neq j$. This column of S will either select $\hat u_i$ as an instrument for equation j or $\hat u_j$ as an instrument for equation i.
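The role of the selection matrix S can be seen by building the augmented instrument matrix for the two equation example directly. In the sketch below the residual matrix is a random placeholder rather than output from an actual first-stage estimator, and the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
T, K, M = 6, 2, 2
Z = rng.normal(size=(T, K))
U_hat = rng.normal(size=(T, M))   # placeholder for estimated residuals [u1_hat, u2_hat]

# S = (0, 1, 0, 0)' picks the second column of I_M (x) U_hat,
# which is u2_hat placed in the rows belonging to equation 1.
S = np.array([[0.0], [1.0], [0.0], [0.0]])

Z_bar = np.kron(np.eye(M), Z)                               # I_M (x) Z
W_hat = np.hstack([Z_bar, np.kron(np.eye(M), U_hat) @ S])   # [Z_bar, (I_M (x) U_hat) S]
```

The last column of `W_hat` holds $\hat u_2$ in the first T rows and zeros in the last T rows, matching the displayed matrix; choosing S = (0,0,1,0)' would instead assign $\hat u_1$ to the second equation.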

Instrumental variables estimators which use residuals can be obtained from

(4.6)    $\hat\delta_A = (\hat W_A'X)^{-1}\hat W_A'y,$

where $\hat W_A = \hat W A_T$ is the MT × q matrix of instruments, $\hat W_A'X$ is nonsingular, $A_T$ is a (MK + L) × q linear combination matrix, and q is the dimension of $\delta$. Depending on the choice of $A_T$, many different system instrumental variables estimators can be obtained from equation (4.6). For example, if $A_T' = [X'(I_M\otimes Z(Z'Z)^{-1}), 0]$ where 0 is a q × L matrix of zeros, then $\hat\delta_A$ is the vector of 2SLS estimates.

The A3SLS estimator will be $\hat\delta_A$ with $A_T$ chosen so as to minimize the asymptotic covariance matrix of $\hat\delta_A$.

The asymptotic distribution theory for $\hat\delta_A$ is complicated by the fact that $\hat W$ contains residuals. Following the usual instrumental variables analysis, we substitute $y = X\delta + u$ into equation (4.6) to obtain

(4.7)    $\sqrt{T}(\hat\delta_A - \delta) = (\hat W_A'X/T)^{-1}\hat W_A'u/\sqrt{T} = (A_T'\hat W'X/T)^{-1}A_T'\hat W'u/\sqrt{T}.$

To use equation (4.7) to obtain the asymptotic distribution of $\hat\delta_A$, it is useful to be able to apply a central limit theorem to $\hat W'u/\sqrt{T}$, which is the vector of $\sqrt{T}$ normalized sums of cross-products of instrumental variables and disturbances. To use a central limit theorem the presence of residuals needs to be accounted for, which requires us to be specific concerning the way in which $\hat U$ was obtained. We will assume that each equation is identified by coefficient restrictions alone, so that $\mathrm{rank}(D_i) = \mathrm{rank}([\Pi_i, I_i]) = q_i$, i = 1,...,M, where $q_i = r_i + s_i$ is the dimension of $\delta_i$.^8a

Since $\mathrm{plim}(Z'X_i/T) = ND_i$ (i = 1,...,M), for $N = \mathrm{plim}(Z'Z/T)$ nonsingular, we can obtain

(4.8)    $\mathrm{plim}[\bar Z'X/T] = (I_M\otimes N)D,$

which has rank q. Let $\bar A_T$ be a MK × q matrix satisfying $\mathrm{plim}\,\bar A_T = \bar A$, where $\bar A'\,\mathrm{plim}(\bar Z'X/T)$ is nonsingular.


8a. Estimation when covariance restrictions are necessary for identification is treated in Newey (1985). Included in this treatment is a relatively simple means of obtaining an initial estimator, and the appropriate A3SLS estimator.
Let $\tilde\delta$ be an instrumental variables estimator which satisfies

(4.9)    $\tilde\delta = (\bar A_T'\bar Z'X)^{-1}\bar A_T'\bar Z'y.$

We now will assume that the matrix of residuals $\hat U$ is obtained from $\hat U_{ti} = y_{ti} - X_{ti}\tilde\delta_i$ (t = 1,...,T, i = 1,...,M). That is, the residuals used to form the instrument matrix $\hat W$ are obtained from an estimator $\tilde\delta$ which uses the predetermined variables Z as instruments. For example, $\hat U$ could be obtained from the 2SLS or the 3SLS estimator. This assumption allows us to purge $\tilde\delta$ from $\hat W'u/\sqrt{T}$ as follows. Let

$M_{2T} = -\sum_{t=1}^{T}[U_t'\otimes(\partial U_t'/\partial\delta)]/T,$

where $U_t$ is the tth row of U and $\partial U_t'/\partial\delta = -\mathrm{diag}[X_{t1},\ldots,X_{tM}]$. Then

(4.10)    $S'(I_M\otimes\hat U)'u = S'\sum_{t=1}^{T}U_t'\otimes\hat U_t'$

$= S'\Big[\sum_{t=1}^{T}U_t'\otimes U_t' - TM_{2T}(\tilde\delta - \delta)\Big]$

$= S'[(I_M\otimes U)'u - TM_{2T}(\bar A_T'\bar Z'X)^{-1}\bar A_T'\bar Z'u]$

where the last equality is obtained by substituting $y = X\delta + u$ into equation (4.9). Let $P_T$ and W be denoted by

$P_T = \begin{bmatrix} I_{MK} & 0 \\ -TS'M_{2T}(\bar A_T'\bar Z'X)^{-1}\bar A_T' & I_L \end{bmatrix}$,    $W = [\bar Z, (I_M\otimes U)S].$

Note that W is obtained from $\hat W$ by replacing $\hat U$ by the true disturbance matrix U. It follows from equation (4.10) that

(4.11)    $\hat W'u/\sqrt{T} = P_T\,W'u/\sqrt{T},$


so that $\hat W'u/\sqrt{T}$ is a nonsingular linear combination of $W'u/\sqrt{T}$, which consists of cross-products of predetermined variables and disturbances. It is now straightforward to use a central limit theorem. Let $e_t = (U_t\otimes U_t)S$ be the 1 × L vector of cross-products of true disturbances corresponding to the zero covariance restrictions. Suppose that, conditional on the 1 × K vector $Z_t$ of contemporaneous predetermined variables, $U_t$ has constant moments up to the fourth order. Then by the orthogonality of disturbances and predetermined variables, the covariance restrictions, and the absence of autocorrelation, an appropriate central limit theorem gives

(4.12)    $W'u/\sqrt{T} \xrightarrow{d} N(0,V),$

$V = \mathrm{plim}\,\frac{1}{T}\sum_{t=1}^{T}E([U_t\otimes Z_t, e_t]'[U_t\otimes Z_t, e_t]\,|\,Z_t) = \begin{bmatrix} \Sigma\otimes N & E(U_t'e_t)\otimes\mathrm{plim}(\sum_{t=1}^{T}Z_t'/T) \\ E(e_t'U_t)\otimes\mathrm{plim}(\sum_{t=1}^{T}Z_t/T) & E(e_t'e_t) \end{bmatrix}$

where the last equality follows by $e_t'(U_t\otimes Z_t) = (e_t'U_t)\otimes Z_t$.
If the disturbances are normally distributed then the off diagonal blocks of V are zero, since all third order moments of a joint normal distribution are zero. Let $M_2 = \mathrm{plim}\,M_{2T}$, so that

(4.13)    $M_2 = \mathrm{plim}\,\frac{1}{T}\begin{bmatrix} \mathrm{diag}(u_1'X_1,\ldots,u_1'X_M) \\ \vdots \\ \mathrm{diag}(u_M'X_1,\ldots,u_M'X_M) \end{bmatrix} = E(I_M\otimes\Sigma)\tilde B'$

where $\Sigma_i$ is the ith row of $\Sigma$ (i = 1,...,M), $\mathrm{plim}(u_i'X_j/T) = \Sigma_i\tilde B_j'$, and the last equality in equation (4.13) follows from Lemma A1 of Appendix A. Also, let $\mathrm{plim}\,P_T = P$, so that by equations (4.8) and (4.13) we can compute

(4.14)    $P = \begin{bmatrix} I_{MK} & 0 \\ -S'M_2(\bar A'(I_M\otimes N)D)^{-1}\bar A' & I_L \end{bmatrix}.$

Then by equations (4.11) and (4.12) we have

(4.15)    $\hat W'u/\sqrt{T} \xrightarrow{d} N(0, PVP'),$

so that PVP' is the asymptotic covariance matrix of $\hat W'u/\sqrt{T}$.

The rest of the derivation of the asymptotic distribution of $\hat\delta_A$ is straightforward. Our assumptions are sufficient to guarantee that $\mathrm{plim}\,\tilde\delta = \delta$. Then by $\mathrm{plim}(U'X_i/T) = \Sigma\tilde B_i'$ (i = 1,...,M), and by equation (4.8),

(4.16)    $\mathrm{plim}\,\hat W'X/T = \mathrm{plim}\,W'X/T = \begin{bmatrix} (I_M\otimes N)D \\ S'(I_M\otimes\Sigma)\tilde B' \end{bmatrix}.$

Let G denote $G = \mathrm{plim}\,\hat W'X/T$. We will assume that $\mathrm{plim}\,A_T = A$ and that A'G is nonsingular. By equations (4.7), (4.15), and (4.16) we obtain

(4.17)    $\sqrt{T}(\hat\delta_A - \delta) \xrightarrow{d} N(0, (A'G)^{-1}A'PVP'A(G'A)^{-1}).$


The A3SLS estimator is obtained by choosing an optimal linear combination matrix A. We will assume that the covariance matrix V is nonsingular. In the normal disturbances case, the nonsingularity of V follows from previous assumptions: see Lemma 4.1 below. Note also that P is nonsingular. Since the asymptotic covariance matrix of $\hat W'u/\sqrt{T}$ is PVP', the linear combination matrix A* which minimizes the asymptotic covariance of the instrumental variables estimator $\hat\delta_A$ (e.g., see White (1982)) satisfies

(4.18)    $A^* = (PVP')^{-1}G = (P^{-1})'V^{-1}P^{-1}G.$

The asymptotic covariance matrix of $\hat\delta_{A^*}$ will be

(4.19)    $\mathrm{Var}(\hat\delta_{A^*}) = (G'(P^{-1})'V^{-1}P^{-1}G)^{-1}.$
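The optimality of the choice in (4.18) can be illustrated numerically: for a given positive definite matrix standing in for PVP', the sandwich covariance in (4.17) evaluated at $A^* = (PVP')^{-1}G$ equals the bound in (4.19), and any other admissible A gives a covariance that exceeds it by a positive semi-definite matrix. The matrices below are random placeholders, not quantities estimated from a model, and `iv_cov` is a name chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n, q = 6, 3                           # n plays the role of MK + L
G = rng.normal(size=(n, q))           # stand-in for plim W_hat'X/T
R = rng.normal(size=(n, n))
Omega = R @ R.T + n * np.eye(n)       # stand-in for PVP' (positive definite)

def iv_cov(A):
    """Sandwich covariance (A'G)^{-1} A' Omega A (G'A)^{-1} from (4.17)."""
    AG_inv = np.linalg.inv(A.T @ G)
    return AG_inv @ A.T @ Omega @ A @ AG_inv.T

A_star = np.linalg.solve(Omega, G)                        # (4.18)
bound = np.linalg.inv(G.T @ np.linalg.solve(Omega, G))    # (4.19)

# A non-optimal alternative: weight the instruments by G itself.
gap = iv_cov(G) - bound
min_eig = np.linalg.eigvalsh((gap + gap.T) / 2).min()
```

Here `iv_cov(A_star)` reproduces `bound`, and the smallest eigenvalue of `gap` is non-negative up to rounding error.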


Since $\hat W'X/T$ is a natural estimator of its probability limit G, we only need to consider consistent estimators of P and V to obtain an estimate $A^*_T$ satisfying $\mathrm{plim}\,A^*_T = A^*$, which is required for implementation of A3SLS. We will assume that sample averages of up to fourth order cross-products of elements of $X_t = [y_t, Z_t]$ (t = 1,...,T) have finite probability limits. Then by $\mathrm{plim}\,\tilde\delta = \delta$, a consistent estimator $\hat V_T$ of V can be obtained as follows. Let $\hat e_t = (\hat U_t\otimes\hat U_t)S$ and

$\hat\Omega = \frac{1}{T}\sum_{t=1}^{T}[\hat U_t, \hat e_t]'[\hat U_t, \hat e_t] = \begin{bmatrix} \hat\Omega_{11} & \hat\Omega_{12} \\ \hat\Omega_{21} & \hat\Omega_{22} \end{bmatrix}.$

Then we can take

(4.20)    $\hat V_T = \begin{bmatrix} \hat\Omega_{11}\otimes(Z'Z/T) & \hat\Omega_{12}\otimes(\sum_{t=1}^{T}Z_t'/T) \\ \hat\Omega_{21}\otimes(\sum_{t=1}^{T}Z_t/T) & \hat\Omega_{22} \end{bmatrix}.$

A consistent estimator $\hat P_T$ of P can be obtained by replacing $M_{2T}$ in the definition of $P_T$ by an estimate $\hat M_{2T}$ satisfying $\mathrm{plim}\,\hat M_{2T} = M_2$. By $\mathrm{plim}\,\tilde\delta = \delta$ such an estimator is given by

$\hat M_{2T} = E(I_M\otimes\hat U)'X/T$

so that we can let
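(4.21)    $\hat P_T = \begin{bmatrix} I_{MK} & 0 \\ -S'\hat M_{2T}(\bar A_T'\bar Z'X/T)^{-1}\bar A_T' & I_L \end{bmatrix}.$

Then we can take $A^*_T = (\hat P_T^{-1})'\hat V_T^{-1}\hat P_T^{-1}\hat W'X/T$, so that the A3SLS estimator can be obtained from equation (4.6),

(4.22)    $\hat\delta_{A^*} = (X'\hat W(\hat P_T^{-1})'\hat V_T^{-1}\hat P_T^{-1}\hat W'X)^{-1}X'\hat W(\hat P_T^{-1})'\hat V_T^{-1}\hat P_T^{-1}\hat W'y.$

A consistent estimator of the asymptotic covariance matrix of $\hat\delta_{A^*}$ is also given by

(4.23)    $\widehat{\mathrm{Var}}(\hat\delta_{A^*}) = [(X'\hat W/T)(\hat P_T^{-1})'\hat V_T^{-1}\hat P_T^{-1}(\hat W'X/T)]^{-1}.$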


Computation of the A3SLS estimator (and estimating its covariance matrix) is only a little more laborious than computation of 3SLS except for the presence of $\hat P_T^{-1}$ in equation (4.22). Note though that because $\hat P_T$ is block triangular,

(4.24)    $\hat P_T^{-1} = \begin{bmatrix} I_{MK} & 0 \\ S'\hat M_{2T}(\bar A_T'\bar Z'X/T)^{-1}\bar A_T' & I_L \end{bmatrix}$

so that $\hat P_T$ need not be inverted numerically. Also, $\hat M_{2T}$ consists mostly of zeros, and if the estimator $\tilde\delta$ used to form the instrumental variable residuals $\hat U$ is 2SLS, then both $(\bar A_T'\bar Z'X/T)^{-1}$ and $\bar A_T$ will be block diagonal.^9
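The closed form inverse in (4.24) depends only on the block triangular structure, which the following sketch verifies for an arbitrary lower-left block. The matrix `C` is a random stand-in for $S'\hat M_{2T}(\bar A_T'\bar Z'X/T)^{-1}\bar A_T'$ and the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
n1, L = 4, 2                      # n1 plays the role of MK, L the number of restrictions
C = rng.normal(size=(L, n1))      # stand-in for the lower-left block of the inverse

P = np.block([[np.eye(n1), np.zeros((n1, L))],
              [-C,         np.eye(L)]])

# Closed form inverse: flip the sign of the lower-left block.
P_inv = np.block([[np.eye(n1), np.zeros((n1, L))],
                  [C,          np.eye(L)]])

product = P @ P_inv
```

Because `product` is the identity matrix, the inverse is available at the cost of a sign change, which is why no numerical inversion is needed.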


9. It may appear that FIML is easier to compute than A3SLS. However, A3SLS is a linear estimator (once $\hat U$ is chosen) so that no iterative process is needed to obtain A3SLS. FIML is also complicated by the fact that, with covariance restrictions, it is more difficult to concentrate out the covariance matrix parameters.
It remains for us to demonstrate that for the general M-equation case, if the disturbances are normally distributed then A3SLS is asymptotically efficient. To obtain this result, it is convenient to simplify the asymptotic covariance matrix for A3SLS. Let the (MK + L) × q matrix $\tilde G$ be denoted by $\tilde G = P^{-1}G$, so that the asymptotic covariance matrix of A3SLS is $(\tilde G'V^{-1}\tilde G)^{-1}$. From equations (4.8) and (4.10) it follows that,

(4.25)    $\tilde G = \begin{bmatrix} (I_M\otimes N)D \\ S'(E + I_{M^2})(I_M\otimes\Sigma)\tilde B' \end{bmatrix}.$

To prove efficiency of A3SLS we can compute $\tilde G'V^{-1}\tilde G$ for the normal disturbance case and show that this matrix is equal to $J_{\delta\delta} - J_{\delta\sigma}(J_{\sigma\sigma})^{-1}J_{\sigma\delta}$, where $\sigma = \sigma^*S$ is the $\frac{1}{2}M(M+1) - L$ vector of distinct, unrestricted elements of $\Sigma$. Since the derivation of this equality is tedious, we relegate it to Appendix B.

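Lemma 4.1: If $N = \mathrm{plim}(Z'Z/T)$ and $\Sigma = E(U_t'U_t)$ are nonsingular and $U_t$ has a joint normal distribution then V and $J_{\sigma\sigma}$ are nonsingular and

$\tilde G'V^{-1}\tilde G = J_{\delta\delta} - J_{\delta\sigma}(J_{\sigma\sigma})^{-1}J_{\sigma\delta}.$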

The reason that A3SLS is efficient in the general case is the same as for the example of Section 2. Using computations similar to those of Section 2 (see equations (2.24) and (2.25)) we can show that there is a nonsingular matrix H such that

(4.26)    $\hat W_{A^*}'u/\sqrt{T} = HW_F'u/\sqrt{T} + o_p(1)$,    $\hat W_{A^*}'X/T = HW_F'X/T + o_p(1)$

so that, asymptotically, the A3SLS instruments are a nonsingular linear transformation of the FIML instruments.

Lemma 4.1 also has implications for the question of identification of the system of equations using covariance restrictions. When the hypotheses of Lemma 4.1 are satisfied, $J_{\sigma\sigma}$ is nonsingular so that the information matrix J is nonsingular if and only if the q dimensional square matrix $J_{\delta\delta} - J_{\delta\sigma}(J_{\sigma\sigma})^{-1}J_{\sigma\delta}$ is nonsingular. Since $V^{-1}$ is also nonsingular, it follows that J is nonsingular if and only if $\tilde G$ has rank q. Therefore, local identification of the parameters of a system of simultaneous equations subject to covariance restrictions is related to the matrix $\tilde G$. In the next section we will derive necessary and sufficient conditions for local identification by studying the properties of $\tilde G$.
5. Identification

During the discussion of A3SLS we assumed that the coefficient vector $\delta$ of the system of equations

$y = X\delta + u$

is identified by coefficient restrictions alone. It is well known, though, that covariance restrictions can help to identify the parameters of a simultaneous equations system (see the references in Hsiao (1983)). Hausman and Taylor (1983) have recently provided necessary and sufficient conditions for identification of a single equation of a simultaneous system using covariance restrictions, and have suggested a possible interpretation of identification of a simultaneous system which is stated in terms of an assignment of residuals as instruments. In this section we show that a necessary condition for first order identification is that there must exist an assignment of residuals as instruments which has the property that for each equation the matrix of cross-products of instrumental variables and right-hand side variables has rank equal to the number of coefficients to be estimated in the equation.

Lemma 4.1 implies that a necessary and sufficient condition for nonsingularity of the information matrix J is that the matrix $\tilde G$ have rank q, which is also equivalent to the condition that the Jacobian, with respect to unknown parameters, of the equation system

(5.1)    $\Pi B + \Gamma = 0$,    $\Sigma - B'\Omega B = 0$

has full rank, when $\Pi$ and $\Omega$ are taken to be fixed constants and the parameter restrictions are imposed. The matrix $\tilde G_1 = \mathrm{plim}(\bar Z'X/T)$, the upper block of $\tilde G$ in (4.25), is familiar from the analysis of identification via coefficient restrictions. The matrix $\tilde G_2$, the lower block, has an interesting structure. The $\ell$th row of $\tilde G_2$, corresponding to the covariance restriction $\sigma_{ij} = 0$, has zero for each element except for the elements corresponding to $\delta_i$, where $\Sigma_j\tilde B_i' = \mathrm{plim}(u_j'X_i/T)$ appears, and the elements corresponding to $\delta_j$, where $\Sigma_i\tilde B_j' = \mathrm{plim}(u_i'X_j/T)$ appears. For example, in a three equation simultaneous equation system, where the $\ell$th row of $\tilde G_2$, which we will denote by $(\tilde G_2)_\ell$, corresponds to $\sigma_{13} = 0$, we have

$(\tilde G_2)_\ell = [\mathrm{plim}(u_3'X_1/T), \; 0, \; \mathrm{plim}(u_1'X_3/T)].$

We can exploit this structure to obtain necessary conditions for identification which are stated in terms of using residuals as instruments.

An assignment of residuals as instruments is a choice for each covariance restriction to either assign $u_i$ as an instrument for equation j or assign $u_j$ as an instrument for equation i, but not both. We can think of an assignment of residuals as instruments to equations as an L-tuple $a_p = (a_{p1},\ldots,a_{pL})$, where p indexes the assignment and each element $a_{p\ell}$ of $a_p$ corresponds to a unique restriction $\sigma_{ij} = 0$, with $a_{p\ell} = i$ if $u_j$ is assigned to equation i and $a_{p\ell} = j$ if $u_i$ is assigned to equation j. Since for each covariance restriction there are two distinct ways of assigning a residual as an instrument, there are $2^L$ possible distinct assignments.

For each assignment, p, of residuals as instruments let $U_{pi}$ be the (possibly nonexistent) matrix of observations on the disturbances assigned to equation i. Let $W_{pi} = [Z, U_{pi}]$ be the resulting matrix of instrumental variables and $C_{pi} = \mathrm{plim}(W_{pi}'X_i/T)$ be the population cross-product matrix of instrumental variables and right-hand side variables for equation i, i = 1,...,M. The following necessary condition for identification is proved in Appendix B.


Theorem 5.1: If $N = \mathrm{plim}(Z'Z/T)$ and $\Sigma$ are nonsingular, then if the information matrix is nonsingular there exists an assignment p* such that

(5.2)    $\mathrm{rank}(C_{p^*i}) = q_i$,  i = 1,...,M.

Theorem 5.1 says that a necessary condition for first-order local identification is that after assigning residuals as instruments the cross-product matrix of instrumental variables and right-hand side variables has rank equal to the number of coefficients in the equation. Note that if $\mathrm{rank}(C_{pi}) = q_i$ there must be at least $q_i$ instrumental variables for the ith equation. We can thus obtain the following order condition from Theorem 5.1. Let $a_i = \max(0, q_i - K)$ (i = 1,...,M).

Corollary 5.2: If $\Sigma$ and N are nonsingular then J nonsingular implies that there exists an assignment of residuals as instruments such that at least $a_i$ residuals are assigned to equation i, for i = 1,...,M.


For the ith equation, $a_i$ is the number of instrumental variables which are required for estimation, in addition to the predetermined variables Z. Therefore, Corollary 5.2 says that first-order local identification implies the existence of an assignment of residuals such that there are enough instruments for each equation. Following Geraci's (1976) analysis of identification of a simultaneous equation system with measurement error we can obtain an algorithm for determining whether or not such an assignment exists. Let $R_i$ be the set of indices $\ell$ of the covariance restrictions $\sigma_{ij} = 0$, for some j, that involve equation i, for all i = 1,...,M. For $a_i > 0$, let $R_{i1},\ldots,R_{ia_i}$ be $a_i$ copies of $R_i$. Let R be the $\sum_{i=1}^{M}a_i$ tuple with components equal to $R_{ij}$ for (j = 1,...,$a_i$, i = 1,...,M).

Theorem 5.3: There exists an assignment of residuals as instruments such that for each i = 1,...,M at least $a_i$ residuals are assigned as instruments to equation i if and only if for each n = 1,...,$\sum_{i=1}^{M}a_i$ the union of any n components of R contains at least n distinct indices $\ell$.
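The order condition and the counting condition of Theorem 5.3 can be checked mechanically. The sketch below is not the algorithm of Geraci (1976); it is a brute-force illustration, under the reading that each covariance restriction supplies exactly one residual to one of its two equations, which compares an exhaustive search over the $2^L$ assignments with the union condition of the theorem. The function names and the example system are hypothetical.

```python
from itertools import combinations, product

def exists_assignment_bruteforce(restrictions, needs):
    """Search all 2^L assignments. Restriction (i, j) gives u_j to equation i or u_i to equation j."""
    for choice in product((0, 1), repeat=len(restrictions)):
        counts = [0] * len(needs)
        for (i, j), c in zip(restrictions, choice):
            counts[i if c == 0 else j] += 1
        if all(cnt >= a for cnt, a in zip(counts, needs)):
            return True
    return False

def union_condition(restrictions, needs):
    """Condition of Theorem 5.3: any n components of R contain at least n restriction indices."""
    components = []
    for i, a in enumerate(needs):
        R_i = frozenset(l for l, pair in enumerate(restrictions) if i in pair)
        components.extend([R_i] * a)           # a_i copies of R_i
    for n in range(1, len(components) + 1):
        for subset in combinations(components, n):
            if len(frozenset().union(*subset)) < n:
                return False
    return True

# Hypothetical three equation system (0-based equation indices) with
# restrictions sigma_12 = 0 and sigma_13 = 0, K = 2 predetermined variables,
# and q = (3, 3, 2) coefficients per equation.
restrictions = [(0, 1), (0, 2)]
K, q = 2, (3, 3, 2)
needs = [max(0, q_i - K) for q_i in q]         # a_i = max(0, q_i - K)

feasible = exists_assignment_bruteforce(restrictions, needs)
condition = union_condition(restrictions, needs)
```

For this example both checks succeed: the first restriction supplies $u_1$ to the second equation and the second restriction supplies $u_3$ to the first equation. Both functions enumerate exponentially many cases, so they are only practical for small systems.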
So far, each of the identification results of this section has been
stated in terms of the number and variety of instruments for each
equation; see Koopmans et al. (1950). It is well known that when only
coefficient restrictions are present the condition that plim($Z'X_i/T$) have
rank $q_i$, $i = 1, \dots, M$, is easily translated into a more transparent
condition on the structural parameters $A = [B', \Gamma']'$. We can also state
a rank condition equivalent to plim($W_{pi}'X_i/T$) having rank $q_i$ when
covariance restrictions are present. For an assignment $p$, let $\Sigma_{pi}$ be
the rows of $\Sigma$ corresponding to residuals which are assigned as
instruments to the ith equation, $i = 1, \dots, M$. Let $\phi_i$ be the
$(M-1-q_i) \times MK$ selection matrix such that the exclusion restrictions
on the ith equation can be written as $\phi_i A_i = 0$, where $A_i$ is the
ith column of $A$.

Lemma 5.4: For a particular assignment $p$ and an equation $i$, the rank of
$C_{pi}$ equals $q_i$ if and only if $\operatorname{rank}[A'\phi_i', \Sigma_{pi}'] = M - 1$.

We prove this result in Appendix B. Together with Lemma 5.1, Lemma 5.4
implies the following necessary rank condition for identification of a
linear simultaneous equations system subject to covariance restrictions.

Theorem 5.5: If $\Sigma$ and $N$ are nonsingular, then nonsingularity of $J$
implies that there exists an assignment, $p^*$, of residuals such that

(5.3)  $\operatorname{rank}[A'\phi_i', \Sigma_{p^*i}'] = M - 1, \quad (i = 1, \dots, M).$

This rank condition is a strengthening of Fisher (1966). Fisher shows
that if a system of equations is first-order locally identified then

(5.4)  $\operatorname{rank}[A'\phi_i', \Sigma_i'] = M - 1,$

where $\Sigma_i$ is the matrix of all rows $(\Sigma)_k$ of $\Sigma$ such that
$\sigma_{ik} = 0$. Theorem 5.5 strengthens this condition by requiring that
equation (5.3) hold for only those rows of $\Sigma$ corresponding to
residuals which are assigned to equation $i$.

It is sufficient for first-order local identification that the rank
of $G$ equals $q$. It would be useful to have other sufficient conditions
for local identification which are more readily interpretable in terms of
the structural parameters. We can obtain a sufficiency result which is
the system analog of Lemma 5.4.

Theorem 5.6: The rank of $G$ equals $q$ if and only if

(5.5)  $\operatorname{rank}\bigl(\operatorname{diag}(\phi_1, \dots, \phi_M, \bar{S}') \cdot [I_M \otimes A',\ (I_M \otimes \Sigma)(E + I_{M^2})]'\bigr) = M^2 - M.$
6. Testing the Overidentifying Covariance Restrictions

Since  covariance  restrictions  specify  that  the  distribution  of 
unobservables  has  certain  properties,  such  restrictions  may  have  weaker  a 
priori  justification  than  coefficient  restrictions,  so  that  it  is  useful 
to  have  available  a  test  for  the  validity  of  overidentifying  covariance 
restrictions. In the case we considered for the A3SLS estimator, the
simultaneous system was identified in the absence of covariance
restrictions, so that these restrictions only help in obtaining a more
efficient A3SLS estimator of the structural parameters $\delta$. If the
restrictions are false then the A3SLS estimator will not be consistent.
We can use these facts to form a Hausman test based on the difference of
the 3SLS and A3SLS estimators. Consider the test statistic

(6.1)  $m = T(\hat\delta_{A3SLS} - \hat\delta_{3SLS})'[\operatorname{Var}(\hat\delta_{3SLS}) - \operatorname{Var}(\hat\delta_{A3SLS})]^{-}(\hat\delta_{A3SLS} - \hat\delta_{3SLS}),$

where $A^{-}$ denotes a generalized inverse of a matrix $A$. Under the null
hypothesis that the covariance restrictions are true this test statistic
will have an asymptotic chi-squared distribution. Except in an
exceptional case, where adding an additional covariance restriction
$\sigma_{ij} = 0$ does not improve the efficiency of enough components of
$\delta$, the degrees
of  freedom  of  this  test  will  be  min(q,L).   The  case  L  <  q  is  of  most 
practical  interest,  since  it  seems  unlikely  that  more  covariance 
restrictions  than  structural  parameters  will  be  available  (e.g.,  see  the 
example  of  Section  2).  When  m  has  degrees  of  freedom  L  and  the 
disturbances are distributed normally, then because 3SLS and A3SLS are
asymptotically equivalent to FIML without and with covariance
restrictions, respectively, m will be asymptotically equivalent to the
classical tests; see Holly (1982). When the disturbances are not
normally  distributed  this  test  will  have  the  optimality  properties 
discussed  in  Newey  (1983). 

The statistic m can be computed by forming a Hausman (1978) test on a
subset of L coefficients. This method corresponds to a particular
choice of generalized inverse of the difference of variance matrices in
equation (6.1), but the test statistic will be numerically invariant to
such a choice of g-inverse as long as the same estimator $\hat\Sigma$ of
$\Sigma$ is used throughout; see Newey (1983). Thus, the specification test
proposed here is asymptotically equivalent under the null hypothesis and
local alternatives to the Wald and LM tests, which seem more difficult to
compute. Since it is likely that the A3SLS and 3SLS estimators would both
be computed in an applied situation, comparison of the estimates provides
a convenient test of the underlying covariance restriction assumptions.
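
The subset computation described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the variance inputs are assumed to be estimates of the asymptotic variance of $\sqrt{T}(\hat\delta - \delta)$, which is why the statistic is multiplied by $T$ as in equation (6.1), and the difference of variance matrices restricted to the chosen subset is assumed to be nonsingular so that an ordinary inverse can stand in for the generalized inverse. The names `hausman_m`, `solve`, and `subset`, and the numbers in the example, are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def solve(a: Matrix, b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("variance difference is singular on this subset")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def hausman_m(
    delta_a3sls: Sequence[float],
    delta_3sls: Sequence[float],
    var_a3sls: Matrix,
    var_3sls: Matrix,
    subset: Sequence[int],
    T: int,
) -> float:
    """m = T d' [Var(3SLS) - Var(A3SLS)]^{-1} d on the chosen coefficients."""
    d = [delta_a3sls[i] - delta_3sls[i] for i in subset]
    v = [[var_3sls[i][j] - var_a3sls[i][j] for j in subset] for i in subset]
    w = solve(v, d)  # w = v^{-1} d without forming the inverse explicitly
    return T * sum(di * wi for di, wi in zip(d, w))


# Hypothetical estimates with L = 1 covariance restriction: test on one coefficient.
m = hausman_m(
    delta_a3sls=[0.46, 1.25],
    delta_3sls=[0.50, 1.20],
    var_a3sls=[[1.5, 0.2], [0.2, 1.4]],
    var_3sls=[[2.0, 0.3], [0.3, 1.5]],
    subset=[0],
    T=200,
)
print(round(m, 6))
```

The resulting value would be compared with a chi-squared critical value with degrees of freedom equal to the size of the subset.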
APPENDIX A


We will first enumerate some properties of the matrix $E$ which will
be useful when obtaining the information matrix. Let $A$ and $B$ be
M-dimensional square matrices. Let $A_i$ be the ith row of $A$ $(i = 1, \dots, M)$.
Let $Q$ be a matrix obtained from $R$ by $Q_i = R_i$ for each row of $R$
corresponding to $\sigma_{jk}$ with $j = k$ and $Q_i = (1/2) R_i$ for each row
corresponding to $\sigma_{jk}$ with $j \neq k$.


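Lemma A1: The matrix $E$ satisfies

(i)    $E' = E$,
(ii)   $EE = I_{M^2}$,
(iii)  $E(A \otimes B)E = B \otimes A$,
(iv)   $E(I_M \otimes A) = [(I_M \otimes A_1)', \dots, (I_M \otimes A_M)']'$,
(v)    $(1/2)(E + I_{M^2}) = R'Q$,
(vi)   $ER' = R'$, $EQ' = Q'$,
(vii)  For $A$ nonsingular, $\partial^2 \ln|\det A| / \partial\mathrm{vec}A\, \partial(\mathrm{vec}A)' = -(I_M \otimes A^{-1})'E(I_M \otimes A^{-1})$.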
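
Properties (i) through (iv) can be checked numerically on a small case. The sketch below assumes, following the proof, that the (i, j) block of $E$ is $E_{ji}$, the $M \times M$ matrix with a single one in row j and column i; property (iv) is checked in the block form described in the proof, where the jth row of block (i, j) holds the ith row of $A$. The helper names (`commutation`, `kron`, and so on) and the test matrices are hypothetical, and the check covers only $M = 2$.

```python
from typing import List

Matrix = List[List[int]]


def commutation(M: int) -> Matrix:
    """Return the M^2 x M^2 matrix E whose (i, j) block is E_ji."""
    E = [[0] * (M * M) for _ in range(M * M)]
    for i in range(M):
        for j in range(M):
            # Block (i, j) has its single one in row j, column i.
            E[i * M + j][j * M + i] = 1
    return E


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def kron(a: Matrix, b: Matrix) -> Matrix:
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def identity(n: int) -> Matrix:
    return [[int(r == c) for c in range(n)] for r in range(n)]


M = 2
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -1]]
E = commutation(M)

prop_i = transpose(E) == E
prop_ii = matmul(E, E) == identity(M * M)
prop_iii = matmul(matmul(E, kron(A, B)), E) == kron(B, A)
# (iv): block row i of E (I_M kron A) is I_M kron A_i, A_i being the ith row of A.
stacked = [row for i in range(M) for row in kron(identity(M), [A[i]])]
prop_iv = matmul(E, kron(identity(M), A)) == stacked
print(prop_i, prop_ii, prop_iii, prop_iv)  # True True True True
```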


Proof: (i): Follows from $E_{ij}' = E_{ji}$. (ii): The ijth block of $EE$ equals
$\sum_m E_{mi}E_{jm}$. Also, $E_{mi}E_{jm} = 0$ for $i \neq j$ and
$E_{mi}E_{im} = E_{mm}$, so that $\sum_m E_{mi}E_{jm} = 0$ for $i \neq j$ and
$\sum_m E_{mi}E_{im} = \sum_m E_{mm} = I_M$. (iii): The ijth block of
$E(A \otimes B)E$ is
$\sum_m \sum_n a_{mn}E_{mi}BE_{jn} = b_{ij}\sum_m \sum_n a_{mn}E_{mn} = b_{ij}A$.
(iv): The ijth block of $E(I_M \otimes A)$ is $E_{ji}A$, which is a matrix of
zeros, except for the jth row, where $A_i$ appears. (v): Number the rows of
$R'$ by $h(M-1)+i$, $h, i = 1, \dots, M$. Also, number the columns of $Q$ by
$j(M-1)+k$, $j, k = 1, \dots, M$. Note that the $h(M-1)+i$ and $i(M-1)+h$ rows
of $R'$ are identical, because these rows each give $\sigma_{hi}$ in the
equality $\sigma = R'\sigma^*$. Similarly the $j(M-1)+k$ and $k(M-1)+j$ columns
of $Q$ are identical. Also note that the $h(M-1)+i$ row of $R'$ is all zeros
except for a one in the place which selects $\sigma_{hi}$ from $\sigma^*$. The
$j(M-1)+k$ column of $Q$ is all zeros except for a one (one-half) in the
place which selects $\sigma_{jk}$ from $\sigma^*$, for $j = k$ $(j \neq k)$. It
follows that the $i(M-1)+i$ row of $R'Q$ has a one in the $i(M-1)+i$ place and
zeros elsewhere, and that the $h(M-1)+i$ row of $R'Q$, for $h \neq i$, has
one-half in the $h(M-1)+i$ and the $i(M-1)+h$ elements and zeros elsewhere.
Then the $h(M-1)+i$ row of $R'Q - (1/2)I_{M^2}$ has a 1/2 in the $i(M-1)+h$
place and zeros elsewhere. Consider the $i(M-1)+h$ row of $(1/2)E$, which will
be the hth row of $(1/2)[E_{1i}, E_{2i}, \dots, E_{Mi}]$, and will thus have
zeros elsewhere but in the $h(M-1)+i$ position, where a 1/2 will appear.
(vi): It is known that $QR' = I_{M(M+1)/2}$; see Richard (1975). Then
$ER' = ER'QR' = E(E + I)R'/2 = (I + E)R'/2$, which implies $ER' = R'$. The
proof of $EQ' = Q'$ is similar. (vii): Suppose $\det A < 0$. By Theil (1971),

$\partial \ln|\det A| / \partial a_{ij} = \partial \ln(-\det A) / \partial a_{ij} = (-1/\det A)[\partial(-\det A) / \partial a_{ij}] = a^{ji},$

where $A^{-1} = [a^{ij}]$, so that
$\partial^2 \ln|\det A| / \partial a_{ij}\, \partial a_{hk} = -a^{jh}a^{ki}$.
Consider the j,kth block of $-(I_M \otimes A^{-1})'E(I_M \otimes A^{-1})$, which is
$-(A^{-1})'E_{kj}A^{-1}$. The i,hth element of this matrix is
$-(A^{-1}_i)'E_{kj}A^{-1}_h = -a^{ki}a^{jh}$, where $A^{-1}_\ell$ is the $\ell$th
column of $A^{-1}$. Since this order is the same used to form
$\mathrm{vec}A = (a_{11}, \dots, a_{M1}, a_{12}, \dots, a_{M2}, \dots, a_{MM})'$,
(vii) follows.

The information matrix is given by

(A.1)  $J(\delta, \sigma) = \operatorname{plim}\bigl(-\tfrac{1}{T}H\bigr),$

where $H$ is the Hessian matrix of the log likelihood function $L_T$ given in
equation (5.7) and $\sigma$ is a vector of elements of $\Sigma$. We derive the
information matrix by ignoring symmetry of $\Sigma$ when taking derivatives of
$L_T$, followed by accounting for symmetry of $\Sigma$ by transforming
$J(\delta, \sigma)$, as in Richard (1975).

By  matrix  differentiation  •  • 

Using the exclusion restrictions and Lemma A1(vii), we compute

(A.3)  $\partial^2 \ln|\det B| / \partial\delta\, \partial\delta' = -B E B',$

so that

(A.4)  $J_{\delta\delta'} = B E B' + \operatorname{plim}[X'(\Sigma^{-1} \otimes I_T)X]/T.$

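Lemma A1(vii) is also useful in obtaining $J_{\sigma\sigma'}$, for
$\sigma = \mathrm{vec}\,\Sigma$, since

(A.5)  $\partial^2 \ln\det\Sigma / \partial\sigma\, \partial\sigma' = -(I_M \otimes \Sigma^{-1})E(I_M \otimes \Sigma^{-1}).$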


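We can use the result that for a nonsingular matrix $A$,

(A.6)  $\partial A^{-1} / \partial a_{ij} = -A^{-1}E_{ij}A^{-1},$

and matrix differentiation to obtain

(A.7)  $\partial^2 u'(\Sigma^{-1} \otimes I_T)u / \partial\sigma_{\ell m}\, \partial\sigma_{hk} = u'(\Sigma^{-1}E_{\ell m}\Sigma^{-1}E_{hk}\Sigma^{-1} \otimes I_T)u$
       $\quad + u'(\Sigma^{-1}E_{hk}\Sigma^{-1}E_{\ell m}\Sigma^{-1} \otimes I_T)u$
       $\quad = \operatorname{tr}\bigl(\Sigma^{-1}(E_{\ell m}\Sigma^{-1}E_{hk} + E_{hk}\Sigma^{-1}E_{\ell m})\Sigma^{-1}U'U\bigr),$

where the last equality follows by

       $u'(A \otimes I_T)u = \operatorname{tr}(A'U'U),$

for any M dimensional square matrix $A$. This equation and
plim($U'U/T$) $= \Sigma$ imply

(A.8)  $\operatorname{plim} \frac{1}{2T}\, \partial^2 [u'(\Sigma^{-1} \otimes I_T)u] / \partial\sigma_{\ell m}\, \partial\sigma_{hk} = (1/2)[\operatorname{tr}(\Sigma^{-1}E_{\ell m}\Sigma^{-1}E_{hk}) + \operatorname{tr}(\Sigma^{-1}E_{hk}\Sigma^{-1}E_{\ell m})] = \sigma^{mh}\sigma^{k\ell},$

so that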


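(A.9)  $\operatorname{plim} \frac{1}{2T}\, \partial^2 [u'(\Sigma^{-1} \otimes I_T)u] / \partial\sigma\, \partial\sigma' = (I_M \otimes \Sigma^{-1})E(I_M \otimes \Sigma^{-1}).$

Equations (A.9) and (A.5) imply

(A.10)  $J_{\sigma\sigma'} = (1/2)(I_M \otimes \Sigma^{-1})E(I_M \otimes \Sigma^{-1}).$

To obtain $J_{\delta\sigma'}$, we again use equation (A.6) to compute

(A.11)  $\partial^2 L_T / \partial\delta\, \partial\sigma_{hk} = -(1/2)X'[(\Sigma^{-1}E_{hk}\Sigma^{-1} + \Sigma^{-1}E_{kh}\Sigma^{-1}) \otimes I_T]u,$

from which we compute

(A.12)  $J_{\delta\sigma'} = (1/2)B(\Sigma^{-1} \otimes I_M)(E + I_{M^2}).$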

To complete the derivation of the information matrix, we now
incorporate the symmetry restrictions $\sigma_{ij} = \sigma_{ji}$, $i \neq j$. Let
$\sigma^* = (\sigma_{11}, \dots, \sigma_{M1}, \sigma_{22}, \dots, \sigma_{M2}, \dots, \sigma_{MM})$.
By Lemma A1 and $ER' = R'$ for the matrix $R$ which has zeros and ones and
satisfies $\mathrm{vec}\,\Sigma = R'\sigma^{*\prime}$, the information matrix is

(A.13)  $J(\delta, \sigma^*) = \begin{bmatrix} B E B' + \operatorname{plim} X'(\Sigma^{-1} \otimes I_T)X/T & B(\Sigma^{-1} \otimes I_M)R' \\ J_{21} & (1/2)R(\Sigma^{-1} \otimes \Sigma^{-1})R' \end{bmatrix}.$

For the case of covariance restrictions, the information matrix is
obtained from equation (A.13) by deleting the rows and columns of
$J(\delta, \sigma^*)$ corresponding to those covariances restricted to be zero.
If $L$ covariances are restricted to be zero, let $S$ be the
$(1/2)M(M+1) \times [(1/2)M(M+1) - L]$ selection matrix for which $\sigma^* S$
is the vector of unrestricted elements of $\sigma^*$. For example, for $M = 2$
with the restriction $\sigma_{12} = 0$ imposed we have

$S = \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix}.$
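
The selection matrix for a general set of zero covariances can be built mechanically. The following is a minimal sketch, assuming $\sigma^*$ is ordered as $(\sigma_{11}, \dots, \sigma_{M1}, \sigma_{22}, \dots, \sigma_{M2}, \dots, \sigma_{MM})$ and that restrictions are passed as index pairs; the function names `sigma_star_labels` and `selection_matrix` are hypothetical.

```python
from typing import List, Set, Tuple


def sigma_star_labels(M: int) -> List[Tuple[int, int]]:
    """Order of sigma*: (1,1), (2,1), ..., (M,1), (2,2), ..., (M,M)."""
    return [(i, j) for j in range(1, M + 1) for i in range(j, M + 1)]


def selection_matrix(M: int, zero: Set[Tuple[int, int]]) -> List[List[int]]:
    """Columns pick the elements of sigma* that are not restricted to zero."""
    labels = sigma_star_labels(M)
    # Store each restricted pair as (larger index, smaller index) to match labels.
    restricted = {(max(p), min(p)) for p in zero}
    keep = [n for n, lab in enumerate(labels) if lab not in restricted]
    return [[int(n == col) for col in keep] for n in range(len(labels))]


# M = 2 with sigma_12 = 0 reproduces the example in the text.
S = selection_matrix(2, {(1, 2)})
print(S)  # [[1, 0], [0, 0], [0, 1]]
```

With $L$ restrictions the result has $(1/2)M(M+1)$ rows and $(1/2)M(M+1) - L$ columns, and its zero rows mark the restricted covariances.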

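The information matrix with covariance restrictions is then

(A.14)  $J(\delta, \sigma^{**}) = \begin{bmatrix} B E B' + \operatorname{plim} X'(\Sigma^{-1} \otimes I_T)X/T & B(\Sigma^{-1} \otimes I_M)R'S \\ J_{21} & (1/2)S'R(\Sigma^{-1} \otimes \Sigma^{-1})R'S \end{bmatrix}.$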
To compute the Cramer-Rao lower bound, we use the fact that

(A.15)  $(R(\Sigma^{-1} \otimes \Sigma^{-1})R')^{-1} = Q(\Sigma \otimes \Sigma)Q'.$

Lemma A1(iii), (v), (vi) yields

(A.16)  $B(\Sigma^{-1} \otimes I_M)\, 2R'Q(\Sigma \otimes \Sigma)Q'R\, (\Sigma^{-1} \otimes I_M)B' = B E B' + B(\Sigma^{-1} \otimes \Sigma)B'.$

We can also compute

(A.17)  $\operatorname{plim} X'(\Sigma^{-1} \otimes I_T)X/T = D'(\Sigma^{-1} \otimes N)D + B(\Sigma^{-1} \otimes \Sigma)B',$

where $N = \operatorname{plim} Z'Z/T$. Subtracting and adding equation (A.16), we can
obtain the inverse of the Cramer-Rao lower bound

(A.18)  $(J^{\delta\delta})^{-1} = B E B' + \operatorname{plim} X'(\Sigma^{-1} \otimes I_T)X/T$
        $\quad - 2B(\Sigma^{-1} \otimes I_M)R'S[S'R(\Sigma^{-1} \otimes \Sigma^{-1})R'S]^{-1}S'R(\Sigma^{-1} \otimes I_M)B'$
        $\quad = D'(\Sigma^{-1} \otimes N)D + 2B(\Sigma^{-1} \otimes I_M)R'[F^{-1} - S(S'FS)^{-1}S']R(\Sigma^{-1} \otimes I_M)B',$

where $F = R(\Sigma^{-1} \otimes \Sigma^{-1})R'$.
APPENDIX B


Proof of Lemma 4.1: For normally distributed disturbances $V_{12}$ and
$V_{21}$ are each zero matrices. By $\Sigma$ and $N$ nonsingular,
$V_{11} = \Sigma \otimes N$ is also nonsingular. Let $R$ and $Q$ be the
$M^2 \times (1/2)M(M+1)$ matrices defined in Appendix A. Let
$U_t^* = (u_{t1}u_{t1}, \dots, u_{tM}u_{t1}, u_{t2}u_{t2}, \dots, u_{tM}u_{tM})'$
be the $(1/2)M(M+1) \times 1$ vector of distinct products of disturbances, so
that

$(U_t' \otimes U_t')' = R'U_t^*.$

Let $\tilde{S}' = \bar{S}'R'$, so that $\bar{S}'(U_t' \otimes U_t')' = \tilde{S}'U_t^*$
and $\tilde{S}'$ is an $L \times (1/2)M(M+1)$ selection matrix. It then follows
from Richard (1975) and $E[\tilde{S}'U_t^*] = 0$ that

(B.1)  $V_{22} = E[\tilde{S}'U_t^*U_t^{*\prime}\tilde{S}] = \operatorname{Var}(\tilde{S}'U_t^*) = \tilde{S}'\operatorname{Var}(U_t^*)\tilde{S} = 2\tilde{S}'Q(\Sigma \otimes \Sigma)Q'\tilde{S}.$

For $\Sigma$ nonsingular, $\Sigma \otimes \Sigma$ is also nonsingular, and since $Q$
has full row rank and $\tilde{S}$ has full column rank, it follows that
$V_{22}$ is nonsingular, and consequently that $V$ is nonsingular. Also, from
Appendix A, it follows that if $\Sigma$ is nonsingular, so is $J_{\sigma\sigma}$.

For normally distributed disturbances, $V_{12}$ and $V_{21}$ are both zero
matrices, so that

(B.2)  $G'V^{-1}G = G_1'V_{11}^{-1}G_1 + G_2'V_{22}^{-1}G_2 = D'(\Sigma^{-1} \otimes N)D + G_2'V_{22}^{-1}G_2.$

We now proceed to calculate $G_2'V_{22}^{-1}G_2$. Using equations (4.26), the
definition of $\bar{S}$, and Lemma A1(v),

(B.3)  $G_2 = \bar{S}'(E + I)(I_M \otimes \Sigma)B' = 2\bar{S}'R'Q(I_M \otimes \Sigma)B' = 2\tilde{S}'Q(I_M \otimes \Sigma)B'.$

Let $F = R(\Sigma^{-1} \otimes \Sigma^{-1})R'$, so that by equation (A.15) we have
$F^{-1} = Q(\Sigma \otimes \Sigma)Q'$. Then by equations (B.1) and (B.3)

(B.4)  $G_2'V_{22}^{-1}G_2 = 2B(\Sigma^{-1} \otimes I_M)R'F^{-1}\tilde{S}(\tilde{S}'F^{-1}\tilde{S})^{-1}\tilde{S}'F^{-1}R(\Sigma^{-1} \otimes I_M)B'.$

Then from equations (B.5), (B.1), and (A.18) the conclusion will follow
if

(B.6)  $F^{-1}\tilde{S}(\tilde{S}'F^{-1}\tilde{S})^{-1}\tilde{S}'F^{-1} = F^{-1} - S(S'FS)^{-1}S'.$

Note that $S$ selects the unrestricted components of $\sigma^*$ and $\tilde{S}$
the restricted. Therefore, rank($S$) + rank($\tilde{S}$) = rank($F$) and
$\tilde{S}'S = 0$, and

(B.7)  $F^{-1/2}\tilde{S}(\tilde{S}'F^{-1}\tilde{S})^{-1}\tilde{S}'F^{-1/2} + F^{1/2}S(S'FS)^{-1}S'F^{1/2} = I,$

since the matrix on the left-hand side of equation (B.7) is the sum of
two orthogonal, idempotent matrices, and the sum of their ranks equals
their dimension. The conclusion follows by substituting equation (B.7)
into equation (B.5).

The following Lemma will be useful in obtaining proofs of the
identification results of Section 5. Suppose for the moment that $G$ is
square. For a particular assignment of residuals as instruments, which
is indexed by $p = 1, \dots, 2^L$, let $C_p = \operatorname{diag}(C_{p1}, \dots, C_{pM})$.

Lemma B1: For some $2^L$-tuple of positive integers $(\ell_1, \dots, \ell_{2^L})$,

$\det(G) = \sum_{p=1}^{2^L} (-1)^{\ell_p} \det(C_p).$


Proof: Let the rows of $G_2$ be denoted by $s_k$, $k = 1, \dots, L$. Each $k$
corresponds to a restriction $\sigma_{ij} = 0$ for some $i \neq j$. Further, each
$s_k$ is a sum of two $1 \times q$ vectors, $s_{ki} + s_{kj}$, where $s_{ki}$ has
plim($u_j'X_i/T$) for the subvector corresponding to $\delta_i$ and zeros for
all other subvectors and $s_{kj}$ has plim($u_i'X_j/T$) for the subvector
corresponding to $\delta_j$ and zeros for all other subvectors. We can
identify $s_{ki}$ with an assignment of residual $j$ to equation $i$ and
$s_{kj}$ with an assignment of residual $i$ to equation $j$. We have

$-G = \begin{bmatrix} (I_M \otimes N)D \\ s_{1i} + s_{1j} \\ \vdots \\ s_{Li} + s_{Lj} \end{bmatrix},$

where we drop the $k$ subscript on $i$ and $j$ for notational convenience. For
each of the $2^L$ distinct assignments, indexed by $p$, let

$G_p = \begin{bmatrix} (I_M \otimes N)D \\ s_p \end{bmatrix},$

where $s_p$ is the $L \times q$ matrix which has its kth row $s_{ki}$ if $u_j$ is
assigned to equation $i$ or $s_{kj}$ if $u_i$ is assigned to equation $j$. The
determinant of a matrix is a linear function of any particular row of the
matrix. It follows that if $L = 1$

(5.18)  $\det(-G) = \det(-G_1) + \det(-G_2).$

Then induction on $L$ gives $\det(-G) = \sum_{p=1}^{2^L} \det(-G_p)$.

Now consider $G_p$ for each $p$. The matrix $(I_M \otimes N)D$ is block diagonal,
where the column partition corresponds to $\delta_i$ for $i = 1, \dots, M$, and
the ith diagonal block is plim $Z'X_i/T$. Further the kth row of $s_p$
consists of zeros except for the subvector corresponding to $\delta_i$, where
plim($u_j'X_i/T$) appears. Then by interchanging pairs of rows of $G_p$, we
can obtain $C_p$ from $G_p$. That is, $C_p = E_pG_p$, where $E_p$ is a product of
matrices which interchange a pair of rows of $G_p$. Note that $E_p$ satisfies
$E_p'E_p = I$, so that $\det(E_p) = (-1)^{\ell_p}$ for $\ell_p$ equal to 1 or 2. It
follows that $\det(G_p) = (-1)^{\ell_p}\det(C_p)$. Then since
$\det(-G) = (-1)^q\det(G)$ and for each $p$, $\det(-G_p) = (-1)^q\det(G_p)$,

$\det(G) = \sum_{p=1}^{2^L} \det(G_p) = \sum_{p=1}^{2^L} (-1)^{\ell_p}\det(C_p).$

Proof of Theorem 5.1:

If rank($G$) $= q$, then there exists a q-dimensional square submatrix
of $G$, denoted by $\bar{G}$, which is nonsingular. The matrix $\bar{G}$ is
obtained by deleting $MK + L - q$ rows of $G$. Each row of $(I_M \otimes N)D$
which is deleted corresponds to ignoring a variable in $Z$ when considering
instruments for an equation $i$. For each $i$ let $\bar{Z}_i$ denote the
predetermined variables which remain as instruments for equation $i$ after
forming $\bar{G}$. Each row of $G_2$ deleted corresponds to ignoring a
covariance restriction. Let $k = 1, \dots, \bar{L}$ index the remaining
covariance restrictions. For each assignment of disturbances as
instruments, indexed as before by $p$, from the remaining covariance
restrictions, let $\bar{W}_{pi} = (\bar{Z}_i, U_{pi})$ be the matrix of
observations on the instrumental variables for equation $i$, and let
$\bar{C}_{pi} = \operatorname{plim}(\bar{W}_{pi}'X_i/T)$ and
$\bar{C}_p = \operatorname{diag}(\bar{C}_{p1}, \dots, \bar{C}_{pM})$. Then from Lemma B1
it follows that

$\det(\bar{G}) = \sum_{p=1}^{2^{\bar{L}}} (-1)^{\ell_p}\det(\bar{C}_p).$

Then $\bar{G}$ nonsingular implies $\det(\bar{C}_p) \neq 0$ for some $p$ and
consequently rank($\bar{C}_p$) $= q$. Since $\bar{C}_p$ is block diagonal,

(B.8)  $\sum_{i=1}^{M} \operatorname{rank}(\bar{C}_{pi}) = \operatorname{rank}(\bar{C}_p) = q.$

Each $\bar{C}_{pi}$ has $q_i$ columns so that rank($\bar{C}_{pi}$) $\leq q_i$ for
$i = 1, \dots, M$, and consequently equation (B.8) implies
rank($\bar{C}_{pi}$) $= q_i$. Now let $p^*$ be an assignment of disturbances as
instruments such that the covariance restrictions indexed by
$k = 1, \dots, \bar{L}$ have disturbances assigned as in the assignment indexed
by $p$ and for the other covariance restrictions the disturbances are
assigned in any feasible fashion. Then for $i = 1, \dots, M$

$q_i \geq \operatorname{rank}(C_{p^*i}) = \operatorname{rank}(\operatorname{plim}[Z : U_{p^*i}]'X_i/T) \geq \operatorname{rank}(\operatorname{plim}[\bar{Z}_i : U_{pi}]'X_i/T) = q_i,$

so that $q_i = \operatorname{rank}(C_{p^*i})$.

Proof  of  Theorem  4.5:  This  follows  as  in  the  algorithm  for  the 
assignment  condition  of  Geraci  (1977). 

Proof of Lemma 5.4: We drop the $p$ subscript for notational convenience.
We also assume $i = 1$. Note that the first column of $\Sigma_1$ consists
entirely of zeros, since to qualify as an instrument for the first
equation a disturbance $u_j$ must satisfy $E(u_1u_j) = \sigma_{1j} = 0$. Let
$e_1$ be an M dimensional unit vector with a one in the first position and
zeros elsewhere. Then $\phi_1A_1 = 0$ and the covariance restrictions imply
$Fe_1 = 0$ where $F = (A'\phi_1', \Sigma_1')'$. Note that
rank($FB^{-1}$) = rank($F$). Also $FB^{-1}B_1 = Fe_1 = 0$ where $B_1$ is the
first column of $B$, so that the first column of $FB^{-1}$ is a linear
combination of the other columns of $FB^{-1}$ by $B_{11} = 1$. Let $\Gamma_1$ be
the rows of $\Gamma$ corresponding to the excluded predetermined variables.
Then $\phi_1AB^{-1} = [E_1', (B^{-1})'\Gamma_1']'$ where $E_1$ is an
$(M-1-r_1) \times M$ matrix for which each row has a one in the position
corresponding to a distinct excluded endogenous variable and zeros
elsewhere. Let $(B^{-1})_1$ be the columns of $B^{-1}$ corresponding to
included right-hand side endogenous variables. Note that

$FB^{-1} = \begin{bmatrix} E_1 \\ \Gamma_1B^{-1} \\ \Sigma_1B^{-1} \end{bmatrix}.$

Then row reduction of $FB^{-1}$ using the rows of $E_1$, and the fact that the
first column of $FB^{-1}$ is a linear combination of the other columns imply

(B.9)  $\operatorname{rank}(FB^{-1}) = \operatorname{rank}\begin{bmatrix} \Gamma_1(B^{-1})_1 \\ \Sigma_1(B^{-1})_1 \end{bmatrix} + M - 1 - r_1.$

Now consider $C_1$. Note that for any $j \neq 1$,
plim $u_j'X_1/T$ = [plim($u_j'Y_1/T$), plim($u_j'Z_1/T$)] =
[plim($u_j'V_1/T$), $0$] = [$\Sigma_j(B^{-1})_1$, $0$],
where $0$ is a $1 \times s_1$ vector of zeros and $\Sigma_j$ is the jth row of
$\Sigma$. By $N$ nonsingular

(B.10)  $\operatorname{rank}(C_1) = \operatorname{rank}\begin{bmatrix} [\Pi_1, I_1] \\ \operatorname{plim}(U_1'X_1/T) \end{bmatrix}.$

By column reduction, using the columns of $[I_1', 0']'$, equation (B.10)
implies

(B.11)  $\operatorname{rank}(C_1) = \operatorname{rank}\begin{bmatrix} \Gamma_1(B^{-1})_1 \\ \Sigma_1(B^{-1})_1 \end{bmatrix} + s_1.$

Then equations (B.9) and (B.11) imply
$M - 1 - \operatorname{rank}(F) = q_1 - \operatorname{rank}(C_1)$, from
which the conclusion of the proposition follows.


Proof of Theorem 5.6: This proof follows closely the proof of Lemma 5.4.
Let $F = \operatorname{diag}(\phi_1, \dots, \phi_M, \bar{S}') \cdot (I_M \otimes A',\ (I_M \otimes \Sigma)(E + I))'$.

Post-multiplication of $F$ by $I_M \otimes B^{-1}$ and row reduction using $E_i$,
$i = 1, \dots, M$, as in the proof of Lemma 5.4 gives

(B.12)  $\operatorname{rank} F(I_M \otimes B^{-1}) = \operatorname{rank}(G) - \sum_{i=1}^{M} s_i + M^2 - M - \sum_{i=1}^{M} r_i = (\operatorname{rank}(G) - q) + M^2 - M.$